Abstract

Constraint programming (CP) has been used with great success to tackle a wide variety of 
constraint satisfaction problems which are computationally intractable in general. Global 
constraints are one of the important factors behind the success of CP. In this paper, we 
study a new global constraint, the multiset ordering constraint, which is shown to be useful 
in symmetry breaking and searching for leximin optimal solutions in CP. We propose efficient 
and effective filtering algorithms for propagating this global constraint. We show that the 
algorithms maintain generalised arc-consistency and we discuss possible extensions. We also 
consider alternative propagation methods based on existing constraints in CP toolkits. Our 
experimental results on a number of benchmark problems demonstrate that propagating the 
multiset ordering constraint via a dedicated algorithm can be very beneficial. 



1 Introduction 



Constraint satisfaction problems (CSPs) play an important role in various fields of computer
science [Tsa93] and are ubiquitous in many real-life application areas such as production planning,
staff scheduling, resource allocation, circuit design, option trading, and DNA sequencing.
In general, solving CSPs is NP-hard and so is computationally intractable [Mac77a]. Constraint
programming (CP) provides a platform for solving CSPs [MS98] [Apt03] and has proven successful
in many real-life applications [Wal96] [Ros00] [RvBW06] despite this intractability. One of the
jewels of CP is the notion of global (or non-binary) constraints. They encapsulate patterns that 
occur frequently in constraint models. Moreover, they contain specialised filtering algorithms 
for powerful constraint inference. Dedicated filtering algorithms for global constraints are vital 
for efficient and effective constraint solving. A number of such algorithms have been developed 
(see [BCR05] for examples).

In this paper, we study a new global constraint, the multiset ordering constraint, which
ensures that the values taken by two vectors of variables, when viewed as multisets, are ordered.
This constraint has applications in breaking row and column symmetry as well as in searching
for leximin optimal solutions. We propose two different filtering algorithms for the multiset
ordering (global) constraint. Whilst they both maintain generalised arc-consistency, they differ
in their complexity. The first algorithm, MsetLeq, runs in time that is linear in the number of
variables ($n$) and in the number of distinct values ($d$), and is suitable when $n$ is much bigger
than $d$. The second algorithm runs in time $O(n \log(n))$, independent of $d$, and is more suitable
when we have large domains. We propose further algorithms by considering some extensions to
MsetLeq. In particular, we show how we can identify entailment and obtain a filtering algorithm
for the strict multiset ordering constraint. These algorithms are proven to maintain generalised
arc-consistency.

We consider alternative approaches to propagating the multiset ordering constraint by using 
existing constraints in CP toolkits. We evaluate our algorithms in contrast to the alternative 
approaches on a variety of representative problems in the context of symmetry breaking. The 
results demonstrate that our filtering algorithms are superior to the alternative approaches either 
in terms of pruning capabilities or in terms of computational times or both. We stress that the 
contribution of this paper is the study of the filtering algorithms for the multiset ordering 
constraint. Symmetry breaking is merely used to compare the efficiency of these propagators. 
A more in-depth comparison of symmetry breaking methods awaits a separate study. Such
a study would be interesting in its own right as multiset ordering constraints are one of the
few methods for breaking symmetry which are not special cases of lexicographical ordering
constraints [CLGR96]. Nevertheless, we provide experimental evidence to support the need for
multiset ordering constraints in the context of symmetry breaking.

The rest of the paper is organised as follows. After we give the necessary formal background
in the next section, we present in Section 3 the utility of the multiset ordering constraint.
In Section 4 we present our first filtering algorithm, prove that it maintains generalised
arc-consistency, and discuss its complexity. Our second algorithm is introduced in Section 5. In
Section 6 we extend our first algorithm to obtain an algorithm for the strict multiset ordering
constraint and to detect entailment. Alternative propagation methods are discussed in Section
7. We demonstrate in Section 8 that decomposing a chain of multiset ordering constraints
into multiset ordering constraints between adjacent or all pairs of vectors hinders constraint
propagation. Computational results are presented in Section 9. Finally, we conclude and outline
our plans for future work in Section 10.

2 Formal Background 

2.1 Constraint Satisfaction Problems And Constraint Programming 

A finite-domain constraint satisfaction problem (CSP) consists of: (i) a finite set of variables
$\mathcal{X}$; (ii) for each variable $X \in \mathcal{X}$, a finite set $\mathcal{D}(X)$ of values (its domain); (iii) and a finite set
$\mathcal{C}$ of constraints on the variables, where each constraint $c(X_i, \ldots, X_j) \in \mathcal{C}$ is defined over the
variables $X_i, \ldots, X_j$ by a subset of $\mathcal{D}(X_i) \times \cdots \times \mathcal{D}(X_j)$ giving the set of allowed combinations
of values. That is, $c$ is an n-ary relation.

A variable assignment or instantiation is an assignment to a variable $X$ of one of the values
from $\mathcal{D}(X)$. Whilst a partial assignment $A$ to $\mathcal{X}$ is an assignment to some but not all $X \in \mathcal{X}$, a
total assignment $A$ to $\mathcal{X}$ is an assignment to every $X \in \mathcal{X}$. We use the notation $A[S]$ to denote
the projection of $A$ on to the set of variables $S$. A (partial) assignment $A$ to the set of variables
$T \subseteq \mathcal{X}$ is consistent iff for all constraints $c(X_i, \ldots, X_j) \in \mathcal{C}$ such that $\{X_i, \ldots, X_j\} \subseteq T$, we
have $A[\{X_i, \ldots, X_j\}] \in c(X_i, \ldots, X_j)$. A solution to the CSP is a consistent assignment to $\mathcal{X}$.

^Throughout, we will say assignment when we mean total assignment to the problem variables. 
A CSP is said to be satisfiable if it has a solution; otherwise it is unsatisfiable. Typically, we are
interested in finding one or all solutions, or an optimal solution given some objective function. 
In the presence of an objective function, a CSP is a constraint optimisation problem. 
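
The definitions above can be made concrete with a small brute-force sketch. It is only an illustration of what a solution is, not a practical solving method; the representation (a dictionary of domains and a list of scope/predicate pairs) and the name `solutions` are assumptions made for this example.

```python
from itertools import product


def solutions(domains, constraints):
    """Enumerate every solution of a finite-domain CSP by brute force.

    domains:     dict mapping a variable name to an iterable of values
    constraints: list of (scope, predicate) pairs, where scope is a tuple of
                 variable names and predicate returns True on allowed tuples
    """
    names = list(domains)
    for values in product(*(domains[v] for v in names)):
        assignment = dict(zip(names, values))  # a total assignment
        if all(pred(*(assignment[v] for v in scope))
               for scope, pred in constraints):
            yield assignment  # consistent total assignment = solution


# Hypothetical toy CSP: X < Y with both domains equal to {1, 2, 3}.
domains = {"X": [1, 2, 3], "Y": [1, 2, 3]}
constraints = [(("X", "Y"), lambda x, y: x < y)]
sols = list(solutions(domains, constraints))
```

The CSP is satisfiable exactly when `sols` is non-empty. The enumeration visits the full product of the domains, which is why the inference methods discussed later matter.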

Constraint Programming (CP) has been used with great success to solve CSPs. Recent years 
have witnessed the development of several CP systems [RvBW06]. To solve a problem using
CP, we need first to formulate it as a CSP by declaring the variables, their domains, as well 
as the constraints on the variables. This part of the problem solving is called modelling. In 
the following, we first introduce our notations and then briefly overview modelling and solving 
in CP. Since we compare our algorithms against the alternative approaches in the context of 
symmetry breaking, we also briefly review matrix modelling and index symmetry. 

2.2 Notation 

Throughout, we assume finite integer domains, which are totally ordered. The domain of a
variable $X$ is denoted by $\mathcal{D}(X)$, and the minimum and the maximum elements in this domain
by $min(X)$ and $max(X)$. We use $vars(c)$ to denote the set of variables constrained by constraint
$c$. If a variable $X$ has a singleton domain $\{v\}$ we say that $v$ is assigned to $X$ and denote this
by $X \leftarrow v$, or simply say that $X$ is assigned. If two variables $X$ and $X'$ are assigned the same
value, then we write $X = X'$, otherwise we write $\neg(X = X')$.

A one-dimensional matrix, or vector, is an ordered list of elements. We denote a vector of
$n$ variables as $\vec{X} = \langle X_0, \ldots, X_{n-1} \rangle$ and a vector of $n$ integers as $\vec{x} = \langle x_0, \ldots, x_{n-1} \rangle$. In either
case, a sub-vector from index $a$ to index $b$ inclusive is denoted by the subscript $a \to b$, such
as: $\vec{X}_{a \to b}$. Unless otherwise stated, the indexing of vectors is from left to right, with 0 being
the most significant index, and the variables of a vector $\vec{X}$ are assumed to be disjoint and not
repeated. The vector $\vec{X}_{X_i \leftarrow d}$ is the vector $\vec{X}$ with some $X_i$ being assigned to $d$. The functions
$floor(\vec{X})$ and $ceiling(\vec{X})$ assign all the variables of $\vec{X}$ their minimum and maximum values,
respectively. A vector $\vec{x}$ in the domain of $\vec{X}$ is designated by $\vec{x} \in \vec{X}$. We write $\{\vec{x} \mid C \wedge \vec{x} \in \vec{X}\}$
to denote the set of vectors in the domain of $\vec{X}$ which satisfy condition $C$. A vector of
variables is displayed by a vector of the domains of the corresponding variables. For instance,
$\vec{X} = \langle \{1,3,4\}, \{1,2,3,4,5\}, \{1,2\} \rangle$ denotes the vector of three variables whose domains are
$\{1,3,4\}$, $\{1,2,3,4,5\}$, and $\{1,2\}$, respectively.

A set is an unordered list of elements in which repetition is not allowed. We denote a set of $n$
elements as $\mathcal{X} = \{x_0, \ldots, x_{n-1}\}$. A multiset is an unordered list of elements in which repetition
is allowed. We denote a multiset of $n$ elements as $\mathbf{x} = \{\{x_0, \ldots, x_{n-1}\}\}$. We write $max(\mathbf{x})$ or
$max\{\{x_0, \ldots, x_{n-1}\}\}$ for the maximum element of a multiset $\mathbf{x}$. By ignoring the order of elements
in a vector, we can view a vector as a multiset. For example, the vector $\langle 0, 1, 0 \rangle$ can be viewed
as the multiset $\{\{1, 0, 0\}\}$. We will abuse notation and write $\{\{\vec{x}\}\}$ or $\{\{\langle x_0, \ldots, x_{n-1} \rangle\}\}$ for the
multiset view of the vector $\vec{x} = \langle x_0, \ldots, x_{n-1} \rangle$.

An occurrence vector $occ(\vec{x})$ associated with $\vec{x}$ is indexed in decreasing order of significance
from the maximum $max\{\{\vec{x}\}\}$ to the minimum $min\{\{\vec{x}\}\}$ value from the values in $\{\{\vec{x}\}\}$. The $i$th
element of $occ(\vec{x})$ is the number of occurrences of $max\{\{\vec{x}\}\} - i$ in $\{\{\vec{x}\}\}$. When comparing two
occurrence vectors, we assume they start and end with the occurrence of the same value, adding
leading/trailing zeroes as necessary. Finally, $sort(\vec{x})$ is the vector obtained by sorting the values
in $\vec{x}$ in non-increasing order.
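
As an illustration of the last two definitions, the following sketch computes occurrence vectors and sorted vectors for plain Python lists of integers. The function names and the optional `hi`/`lo` parameters (used to pad two occurrence vectors to a common range of values) are assumptions of this example, not notation from the paper.

```python
def occ(values, hi=None, lo=None):
    """Occurrence vector of a list of integers.

    Index i counts the occurrences of hi - i, so index 0 (the most
    significant position) corresponds to the largest value. Passing a
    common hi and lo pads the result with leading/trailing zeroes so that
    two occurrence vectors range over the same values.
    """
    hi = max(values) if hi is None else hi
    lo = min(values) if lo is None else lo
    return [sum(1 for v in values if v == hi - i) for i in range(hi - lo + 1)]


def sort_desc(values):
    """The vector sorted in non-increasing order."""
    return sorted(values, reverse=True)


x = [0, 1, 0]
y = [2, 2, 0]
occ_x = occ(x)                  # counts for the values 1, 0
occ_x_padded = occ(x, hi=2, lo=0)  # counts for the values 2, 1, 0
occ_y_padded = occ(y, hi=2, lo=0)
```

Padding both vectors to the range from 2 down to 0 is what makes a position-by-position comparison of `occ_x_padded` and `occ_y_padded` meaningful.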

2.3 Search, Local Consistency and Propagation 

Solutions to CSPs are often found by searching systematically the space of partial assignments. 
A common search strategy is backtracking search. We traverse the search space in a depth- 
first manner and at each step extend a partial assignment by assigning a value to one more 
variable. If the extended assignment is consistent then one more variable is instantiated and so 
on. Otherwise, the variable is re-instantiated with another value. If none of the values in the 
domain of the variable is consistent with the current partial assignment then one of the previous
variable assignments is reconsidered. 

Backtracking search may be seen as a search tree traversal. Each node defines a partial 
assignment and each branch defines a variable assignment. A partial assignment is extended 
by branching from the corresponding node to one of its subtrees by assigning a value $j$ to the
next variable $X_i$ from the current $\mathcal{D}(X_i)$. Upon backtracking, $j$ is removed from $\mathcal{D}(X_i)$. This
process is often called labelling. The order of the variables and values chosen for consideration
can have a profound effect on the size of the search tree [HE80]. The order can be determined
before search starts, in which case the labelling heuristic is static. If the next variable and/or 
value are determined during search then the labelling heuristic is dynamic. 
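
The search procedure described above can be sketched as follows. This is a minimal chronological backtracking search with a static labelling heuristic (variables and values are tried in the order in which they are given) and no propagation; the data representation and function names are assumptions of this example.

```python
def consistent(assignment, constraints):
    """Check every constraint whose scope is fully assigned."""
    return all(pred(*(assignment[v] for v in scope))
               for scope, pred in constraints
               if all(v in assignment for v in scope))


def backtrack(domains, constraints, assignment=None):
    """Depth-first search over partial assignments; returns one solution or None."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(domains):
        return dict(assignment)
    # Static variable ordering: first variable not yet assigned.
    var = next(v for v in domains if v not in assignment)
    for value in domains[var]:  # static value ordering
        assignment[var] = value
        if consistent(assignment, constraints):
            result = backtrack(domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]  # undo and try the next value
    return None  # no value works: the caller reconsiders an earlier variable


# Hypothetical toy problem: X < Y < Z over {1, 2, 3}.
domains = {"X": [1, 2, 3], "Y": [1, 2, 3], "Z": [1, 2, 3]}
constraints = [(("X", "Y"), lambda x, y: x < y),
               (("Y", "Z"), lambda y, z: y < z)]
solution = backtrack(domains, constraints)
```

A dynamic heuristic would replace the `next(...)` line with a choice that depends on the current state of the search, for example the variable with the fewest remaining values.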

The size of the search tree of a CSP is in the worst case equal to the product of the domain 
sizes of all variables. It is thus too expensive in general to enumerate all possible assignments 
using a naive backtracking algorithm. Consequently, many CP solution methods are based on 
inference which reduces the problem to an equivalent (i.e. with the same solution set) but smaller 
problem. Since complete inference is too computationally expensive to be used in practice, 
inference methods are often incomplete and enforce local consistencies. A local consistency is 
a property of a CSP defined over "local" parts of the CSP, in other words defined over subsets 
of the variables and constraints of the CSP. The main idea is to remove from the domains of 
the variables the values that will not take part of any solution. Such values are said to be 
inconsistent. Inconsistent values can be detected by using a number of consistency properties. 

A common consistency property proposed in [Mac77b] is generalised arc-consistency. A
constraint $c$ is generalised arc-consistent (or GAC), written $GAC(c)$, if and only if for every $X \in
vars(c)$ and every $v \in \mathcal{D}(X)$, there is at least one assignment to $vars(c)$ that assigns $v$ to $X$ and
satisfies $c$. Values for variables other than $X$ participating in such assignments are known as the
support for the assignment of $v$ to $X$. Generalised arc-consistency is established on a constraint
$c$ by removing elements from the domains of variables in $vars(c)$ until the GAC property holds.
For binary constraints, GAC is equivalent to arc-consistency (AC, see [Mac77a]). Another useful
local consistency is bound consistency that treats the domains of the variables as intervals. For
integer variables, the values have a natural total order, therefore the domain can be represented
by an interval whose lower bound is the minimum value and the upper bound is the maximum
value in the domain. A constraint $C$ is bound consistent (BC) iff for every variable, for its
minimum (maximum) there exists a value for every other variable between its minimum and
maximum that satisfies $C$ [vHSD98].
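
To make the GAC definition concrete, the sketch below establishes GAC on a single constraint by enumerating all tuples, which takes time exponential in the arity. It is a reference implementation of the definition only; dedicated filtering algorithms exist precisely to avoid this enumeration. The function name and data layout are assumptions of this example.

```python
from itertools import product


def enforce_gac(domains, scope, pred):
    """Return new domains in which the constraint (scope, pred) is GAC.

    A value is kept only if it appears in at least one satisfying tuple
    drawn from the current domains, i.e. only if it has a support.
    """
    supported = {var: set() for var in scope}
    for values in product(*(sorted(domains[var]) for var in scope)):
        if pred(*values):
            for var, val in zip(scope, values):
                supported[var].add(val)
    # Variables outside the scope keep their domains unchanged.
    return {var: supported.get(var, set(domains[var])) for var in domains}


# Hypothetical example: X < Y with both domains equal to {1, 2, 3}.
domains = {"X": {1, 2, 3}, "Y": {1, 2, 3}}
pruned = enforce_gac(domains, ("X", "Y"), lambda x, y: x < y)
```

One pass is enough for a single constraint: every value occurring in a satisfying tuple is itself supported by that tuple, so the satisfying tuples survive the pruning intact.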

We will compare local consistency properties applied to (sets of) logically equivalent
constraints, $c_1$ and $c_2$. As in [DB97], we say that a local consistency property $\Phi$ on $c_1$ is as strong
as $\Psi$ on $c_2$ iff, given any domains, if $\Phi$ holds on $c_1$ then $\Psi$ holds on $c_2$; we say that $\Phi$ on $c_1$ is
strictly stronger than $\Psi$ on $c_2$ iff $\Phi$ on $c_1$ is as strong as $\Psi$ on $c_2$ but not vice versa.

In a constraint program, searching for solutions is interleaved with local consistency as fol- 
lows. Local consistency is first enforced before search starts to preprocess the problem and 
prune subsequent search. It is then maintained dynamically at each node of the search tree 
with respect to the current variable assignment. In this way, the domains of the uninstantiated 
variables shrink and the search tree gets smaller. Whilst the process of maintaining local consis- 
tency over a CSP is known as propagation, the process of removing inconsistent values from the 
domains is known as pruning or filtering. For effective constraint solving, it is important that 
propagation removes efficiently as many inconsistent values as possible. Note that GAC is an
important consistency property as it is the strongest filtering that can be done by reasoning on only
a single constraint at a time. Many global constraints in CP toolkits therefore encapsulate their
own filtering algorithm which typically achieves GAC at a low cost by exploiting the semantics of
the constraint. As an example, Régin in [Reg94] gives a filtering algorithm for the all-different
constraint which maintains GAC in time $O(n^{2.5})$ where $n$ is the number of variables.

Figure 1: The matrix model of the sport scheduling problem in [vHMPR99].

The semantics of a constraint can help not only find supports and inconsistent values quickly
but also detect entailment and disentailment without having to do filtering. A constraint $c$ is
entailed if all assignments of values to $vars(c)$ satisfy $c$. Similarly, a constraint $c$ is disentailed
when all assignments of values to $vars(c)$ violate $c$. If a constraint in a CSP is detected to be
entailed, it does not have to be propagated in the future, and if it is detected to be disentailed then
it is proven that the current CSP has no solution and we can backtrack.
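
The two notions can be illustrated by a brute-force check that inspects every assignment to the variables of a constraint. This is only a sketch of the definitions: in practice entailment is detected from the semantics of the constraint rather than by enumeration. The function name `status` is an assumption of this example, and the domains are assumed to be non-empty.

```python
from itertools import product


def status(domains, pred):
    """Classify a constraint over the given (non-empty) domains.

    domains: list of iterables, one per variable in the constraint's scope
    pred:    predicate returning True on tuples that satisfy the constraint
    """
    outcomes = {bool(pred(*values)) for values in product(*domains)}
    if outcomes == {True}:
        return "entailed"      # every assignment satisfies the constraint
    if outcomes == {False}:
        return "disentailed"   # every assignment violates the constraint
    return "undetermined"


def less(x, y):
    return x < y


s1 = status([{1, 2}, {3, 4}], less)
s2 = status([{3, 4}, {1, 2}], less)
s3 = status([{1, 3}, {2}], less)
```

For $X < Y$ the same information is available from the bounds alone: the constraint is entailed when $max(X) < min(Y)$ and disentailed when $min(X) \geq max(Y)$.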

2.4 Modelling 

CP toolkits provide constructs for declaring the variables, their domains, as well as the con- 
straints between these variables of a CSP. They often contain a library of predefined constraints 
with a particular semantics that can be applied to sets of variables with varying arities and
domains. For instance, $all\text{-}different([X_1, X_2, X_3])$ with $\mathcal{D}(X_1) = \mathcal{D}(X_2) = \{1,2\}$, $\mathcal{D}(X_3) = \{1,2,3\}$
is an instance of $all\text{-}different([X_1, \ldots, X_n])$ defined on three variables with the specified
domains. It has the semantics that the variables involved take different values [Reg94]. The
$all\text{-}different([X_1, \ldots, X_n])$ constraint can be applied to any number of variables with any
domains. Such constraints are often referred to as global constraints. Beldiceanu has catalogued
hundreds of global constraints, most of which are defined over finite domain variables [BCR05].
They permit the user to model a problem easily by compactly specifying common patterns that 
occur in many constraint models. They also provide solving advantages which we shall explain 
later. 

Since constraints provide a rich language, a number of alternative models will often exist, 
some of which will be more effective than others. However, one of the most common and 
effective modelling patterns in constraint programming is a matrix model. A matrix model is the 
formulation of a CSP with one or more matrices of decision variables (of one or more dimensions) 
[FFH+02b]. Matrix models are a natural way to represent problems that involve finding a
function or a relation. We shall illustrate matrix models and the power of global constraints in
modelling through the sport scheduling problem. This problem involves scheduling games
between $n$ teams over $n - 1$ weeks [vHMPR99]. Each week is divided into $n/2$ periods, and
each period is divided into two slots. The team in the first slot plays at home, while the team
in the second slot plays away. The goal is to find a schedule such that: (i) every team plays
exactly once a week; (ii) every team plays against every other team; (iii) every team plays at
most twice in the same period over the tournament. Van Hentenryck et al. propose a model
for this problem in [vHMPR99], where they extend the problem with a "dummy" final week
to make the problem more uniform. The model consists of two matrices: a 3-d matrix $T$ of
$Periods \times Eweeks \times Slots$ and a 2-d matrix $G$ of $Periods \times Weeks$, where $Periods$ is the set of
$n/2$ periods, $Eweeks$ is the set of $n$ extended weeks, $Weeks$ is the set of $n - 1$ weeks, and $Slots$ is
the set of 2 slots. In $T$, weeks are extended to include the dummy week, and each element takes
a value from $\{1, \ldots, n\}$ expressing that a team plays in a particular week in a particular period,
in the home or away slot. For the sake of simplicity, we will treat this matrix as 2-d where
the rows represent the periods and the columns represent the extended weeks, and each entry
of the matrix is a pair of variables. The elements of $G$ take values from $\{1, \ldots, n^2\}$, and each
element denotes a particular unique combination of home and away teams. More precisely, a
game played between a home team $h$ and an away team $a$ is uniquely identified by $(h - 1) * n + a$
(see Figure 1).

Consider the columns of $T$ which denote the (extended) weeks. The first set of constraints
posts all-different (global) constraints on the columns of $T$ to enforce that each column is a
permutation of $1 \ldots n$. The second constraint is an all-different (global) constraint on $G$ that
enforces that all games must be different. Consider the rows of $T$ which represent the periods.
The third set of constraints posts the global cardinality constraints ($gcc$) on the rows to ensure
that each of $1 \ldots n$ occurs exactly twice in every row. The fourth set of constraints are called
channelling constraints and are often used when multiple matrices are used to model the problem
and they have to be linked together. In our case, the channelling constraints link a variable
representing a game ($G_{i,j}$) with the variable representing the home team ($T_{i,j,0}$) and
the corresponding variable representing the away team ($T_{i,j,1}$) such that $G_{i,j} = (T_{i,j,0} - 1) * n +
T_{i,j,1}$. The final set of constraints will be discussed after giving an overview of symmetry in CP.
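
The channelling expression is a plain row-major encoding of the pair (home team, away team) into a single integer. A short sketch, with hypothetical helper names `game_id` and `teams`, shows the encoding and its inverse for teams numbered from 1 to $n$:

```python
def game_id(home, away, n):
    """Encode a (home, away) pair of teams in 1..n as a value in 1..n*n."""
    return (home - 1) * n + away


def teams(game, n):
    """Decode a game identifier back into its (home, away) pair."""
    home, away = divmod(game - 1, n)
    return home + 1, away + 1


n = 4
g = game_id(2, 3, n)      # home team 2, away team 3
pair = teams(g, n)
all_ids = {game_id(h, a, n)
           for h in range(1, n + 1)
           for a in range(1, n + 1) if h != a}
```

Because the encoding is injective, an all-different constraint on $G$ forbids scheduling the same ordered pair of teams twice.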



2.5 Symmetry 

A symmetry is an intrinsic property of an object which is preserved under certain classes of 
transformations. For instance, rotating a chess board 90° gives us a board which is indistin- 
guishable from the original one. A CSP can have symmetries in the variables or domains or 
both which preserve satisfiability. In the presence of symmetry, any (partial) assignment can 
be transformed into a set of symmetrically equivalent assignments without affecting whether or 
not the original assignment satisfies the constraints. 

Symmetry in constraint programs increases the size of the search space. It is therefore 
important to prune symmetric states so as to improve the search efficiency. This process is 
referred to as symmetry breaking. One of the easiest and most efficient ways of symmetry 
breaking is adding symmetry breaking constraints [Pug93], [CLGR96]. These constraints impose
an ordering on the symmetric objects. Among the set of symmetric assignments, only those that
satisfy the ordering constraints are chosen for consideration during the process of search. For
instance, in the matrix model of Figure 1, any solution can be mapped to a symmetric solution
by swapping any two teams ($T_{i,j,0}$ and $T_{i,j,1}$). These solutions are essentially the same. We can
add the set of constraints (5) in order to break such symmetry between the two teams and speed
up search by avoiding visiting symmetric branches.

A common pattern of symmetry in matrix models is that the rows and/or columns of a 2-d 
matrix represent indistinguishable objects. Consequently the rows and/or columns of an assign- 
ment can be swapped without affecting whether or not the assignment is a solution [FFH+02a].
These are called row or column symmetry; the general term is index symmetry. For instance, in
the matrix model of Figure 1, the (extended) weeks over which the tournament is held, as well
as the periods of a week, are indistinguishable. The rows and the columns of $T$ and $G$ are therefore
symmetric. Note that we treat T as a 2-d matrix where the rows represent the periods and 
columns represent the (extended) weeks, and each entry of the matrix is a pair of variables. 
If every bijection on the values of an index is an index symmetry, then we say that the index 
has total symmetry. If the first (resp. second) index of a 2-d matrix has total symmetry, we say
that the matrix has total column symmetry (resp. total row symmetry). In many matrix models
only a subset of the rows or columns are interchangeable. If the first (resp. second) index of a
2-d matrix has partial symmetry, we say that the matrix has partial column symmetry (resp.
partial row symmetry)^. There is one final case to consider: an index may have partial index
symmetry on multiple subsets of its values. For example, a CSP may have a 2-d matrix for 
which rows 1, 2 and 3 are interchangeable and rows 5, 6 and 7 are interchangeable. This can 
occur on any or all of the indices. 

An $n \times m$ matrix with total row and column symmetry has $n!\,m!$ symmetries, a number
which increases super-exponentially. An effective way to deal with this class of symmetry is to 
use lexicographic ordering constraints. 

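Definition 1 A strict lexicographic ordering $\vec{x} <_{lex} \vec{y}$ between two vectors of integers $\vec{x} =
\langle x_0, x_1, \ldots, x_{n-1} \rangle$ and $\vec{y} = \langle y_0, y_1, \ldots, y_{n-1} \rangle$ holds iff $\exists k$, $0 \leq k < n$, such that $x_i = y_i$ for all
$0 \leq i < k$ and $x_k < y_k$.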

The ordering can be weakened to include equality. 

Definition 2 Two vectors of integers $\vec{x} = \langle x_0, x_1, \ldots, x_{n-1} \rangle$ and $\vec{y} = \langle y_0, y_1, \ldots, y_{n-1} \rangle$ are
lexicographically ordered $\vec{x} \leq_{lex} \vec{y}$ iff $\vec{x} <_{lex} \vec{y}$ or $\vec{x} = \vec{y}$.

Given two vectors of variables $\vec{X} = \langle X_0, X_1, \ldots, X_{n-1} \rangle$ and $\vec{Y} = \langle Y_0, Y_1, \ldots, Y_{n-1} \rangle$, we write a
lexicographic ordering constraint as $\vec{X} \leq_{lex} \vec{Y}$ and a strict lexicographic ordering constraint as
$\vec{X} <_{lex} \vec{Y}$. These constraints are satisfied by an assignment if the vectors $\vec{x}$ and $\vec{y}$ assigned to
$\vec{X}$ and $\vec{Y}$ are ordered according to Definitions 2 and 1, respectively.
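
Definitions 1 and 2 translate directly into code on ground vectors. The sketch below follows the definitions literally; the function names are assumptions of this example. For equal-length sequences the result coincides with Python's built-in comparison of tuples and lists, which is also lexicographic.

```python
def lex_less(x, y):
    """Strict lexicographic ordering of two equal-length integer vectors."""
    assert len(x) == len(y)
    for xi, yi in zip(x, y):
        if xi != yi:
            # First position k where the vectors differ: x_k < y_k decides.
            return xi < yi
    return False  # the vectors are equal


def lex_leq(x, y):
    """Lexicographic ordering including equality."""
    return lex_less(x, y) or list(x) == list(y)


a = lex_less((0, 1, 2), (0, 2, 0))   # differ at index 1, and 1 < 2
b = lex_less((0, 1, 2), (0, 1, 2))   # equal vectors are not strictly ordered
c = lex_leq((0, 1, 2), (0, 1, 2))
```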

To deal with column (resp. row) symmetry, we can constrain the columns (resp. rows) to 
be non-decreasing as the value of the index increases. One way to achieve this is by imposing 
a lexicographic ordering constraint between adjacent columns (resp. rows). These constraints 
are consistent which means that they leave at least one assignment among the set of symmetric 
assignments. We can deal with row and column symmetry in a similar way by imposing a 
lexicographic ordering constraint between adjacent rows and columns simultaneously. Also such 
constraints are consistent. Even though these constraints may not eliminate all symmetry, they 
have been shown to be effective at removing many symmetries from the search spaces of many 
problems. If a matrix has only partial column (resp. partial row) symmetry then the symmetry 
can be broken by constraining the interchangeable columns (resp. rows) to be in lexicographically 
non-decreasing order. This can be achieved in a manner similar to that described above. The 
method also extends to matrices that have partial or total column symmetry together with 
partial or total row symmetry. Finally, if the columns and/or rows of a matrix have multiple
partial symmetries then each can be broken in the manner just described [FFH+02a].



3 The Multiset Ordering Constraint and Its Applications 

Multiset ordering is a total ordering on multisets. 

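Definition 3 Strict multiset ordering $\mathbf{x} <_m \mathbf{y}$ between two multisets of integers $\mathbf{x}$ and $\mathbf{y}$ holds
iff:

$\mathbf{x} = \{\{\}\} \wedge \mathbf{y} \neq \{\{\}\} \;\vee$
$max(\mathbf{x}) < max(\mathbf{y}) \;\vee$
$(max(\mathbf{x}) = max(\mathbf{y}) \wedge \mathbf{x} - \{\{max(\mathbf{x})\}\} <_m \mathbf{y} - \{\{max(\mathbf{y})\}\})$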



^Throughout, we will say row symmetry (resp. column symmetry) when we mean total row symmetry (resp. 
total column symmetry) to the problem variables. 
That is, either x is empty and y is not, or the largest value in x is less than the largest value
in y, or the largest values are the same and, if we eliminate one occurrence of the largest value 
from both x and y, the resulting two multisets are ordered. We can weaken the ordering to 
include multiset equality. 

Definition 4 Two multisets of integers $\mathbf{x}$ and $\mathbf{y}$ are multiset ordered $\mathbf{x} \leq_m \mathbf{y}$ iff $\mathbf{x} <_m \mathbf{y}$ or
$\mathbf{x} = \mathbf{y}$.
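
A direct, unoptimised reading of Definitions 3 and 4 on lists of integers is sketched below. The function names are assumptions of this example. The recursion of Definition 3 is unrolled into a loop that repeatedly compares and removes the current maxima.

```python
def mset_less(x, y):
    """Strict multiset ordering following Definition 3."""
    x, y = sorted(x), sorted(y)  # ascending, so the maximum is the last element
    while True:
        if not x:
            return bool(y)       # x empty: ordered iff y is non-empty
        if not y:
            return False         # y empty but x is not
        if x[-1] != y[-1]:
            return x[-1] < y[-1]  # the maxima decide
        x.pop()                  # equal maxima: remove one occurrence
        y.pop()                  # from each side and continue


def mset_leq(x, y):
    """Multiset ordering including multiset equality (Definition 4)."""
    return mset_less(x, y) or sorted(x) == sorted(y)


r1 = mset_less([0, 1, 0], [1, 1, 0])
r2 = mset_less([1, 0, 0], [0, 1, 0])   # same multiset, so not strictly ordered
r3 = mset_leq([1, 0, 0], [0, 1, 0])
```

Equivalently, the two multisets can be sorted in non-increasing order and compared lexicographically; for vectors of equal length this is the same comparison as the one between the corresponding occurrence vectors.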

Even though this ordering is defined on multisets, it may also be useful to order vectors by 
ignoring the positions but rather concentrating on the values taken by the variables. We can do 
this by treating a vector as a multiset. Given two vectors of variables $\vec{X} = \langle X_0, X_1, \ldots, X_{n-1} \rangle$
and $\vec{Y} = \langle Y_0, Y_1, \ldots, Y_{n-1} \rangle$, we write a multiset ordering constraint as $\vec{X} \leq_m \vec{Y}$ and a strict
multiset ordering constraint as $\vec{X} <_m \vec{Y}$. These constraints ensure that the vectors $\vec{x}$ and $\vec{y}$
assigned to $\vec{X}$ and $\vec{Y}$, when viewed as multisets, are multiset ordered according to Definitions 4
and 3, respectively.



3.1 Breaking Index Symmetry 

One important application of the multiset ordering constraint is in breaking index symmetry 
[FHK+03]. If $X$ is an $n$ by $m$ matrix of decision variables, then we can break its column symmetry
by imposing the constraints $\langle X_{i,0}, \ldots, X_{i,m-1} \rangle \leq_m \langle X_{i+1,0}, \ldots, X_{i+1,m-1} \rangle$ for $i \in [0, n-2]$, or for
short $C_0 \leq_m C_1 \ldots \leq_m C_{n-1}$ where $C_i$ corresponds to the vector of variables $\langle X_{i,0}, \ldots, X_{i,m-1} \rangle$
which belong to the $i$th column of the matrix. Similarly we can break its row symmetry by
imposing the constraints $\langle X_{0,j}, \ldots, X_{n-1,j} \rangle \leq_m \langle X_{0,j+1}, \ldots, X_{n-1,j+1} \rangle$ for $j \in [0, m-2]$, or for short
$R_0 \leq_m R_1 \ldots \leq_m R_{m-1}$ in which $R_j$ corresponds to the variables $\langle X_{0,j}, \ldots, X_{n-1,j} \rangle$ of the $j$th row.
Such constraints are consistent symmetry breaking constraints. Note that when we have partial 
column (resp. row) symmetry, then the symmetry can be broken by imposing multiset ordering 
constraints on the symmetric columns (resp. rows) only. 

Whilst multiset ordering is a total ordering on multisets, it is not a total ordering on vectors. 
In fact, it is a preordering as it is not antisymmetric. Hence, each symmetry class may have 
more than one element where the rows (resp. columns) are multiset ordered. This does not 
however make lexicographic ordering constraints preferable over multiset ordering constraints 
in breaking row (resp. column) symmetry. The reason is that they are incomparable as they 
remove different symmetric assignments in an equivalence class [FHK+03].
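
A two-row example makes the lack of antisymmetry visible. In this sketch (helper names are assumptions), the rows $\langle 0, 1 \rangle$ and $\langle 1, 0 \rangle$ are different vectors but equal multisets, so both row orders of the matrix satisfy the multiset ordering constraints, whereas only one satisfies the lexicographic ordering constraints.

```python
from itertools import permutations


def mset_leq(x, y):
    """Multiset ordering via sorting in non-increasing order."""
    return sorted(x, reverse=True) <= sorted(y, reverse=True)


def lex_leq(x, y):
    """Lexicographic ordering (Python tuples compare lexicographically)."""
    return tuple(x) <= tuple(y)


def rows_ordered(matrix, leq):
    """True if every pair of adjacent rows satisfies the ordering leq."""
    return all(leq(a, b) for a, b in zip(matrix, matrix[1:]))


rows = [(0, 1), (1, 0)]
row_orders = [list(p) for p in permutations(rows)]
mset_ok = [m for m in row_orders if rows_ordered(m, mset_leq)]
lex_ok = [m for m in row_orders if rows_ordered(m, lex_leq)]
```

This shows only that multiset ordering may leave several members of a symmetry class; it does not illustrate the incomparability of the two orderings, which concerns which members are removed.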

One of the nice features of using multiset ordering for breaking index symmetry is that 
by constraining one dimension of the matrix, say the rows, to be multiset ordered, we do not 
distinguish the columns. We can still freely permute the columns, as multiset ordering the rows 
ignores positions and is invariant to column permutation. We can therefore consistently post 
multiset ordering constraints on the rows together with either multiset ordering or lexicographic 
ordering constraints on the columns when we have both row and column symmetry. Neither 
approach may eliminate all symmetries, however they are all potentially interesting. Since 
lexicographic ordering and multiset ordering constraints are incomparable, imposing one ordering 
in one dimension and the other ordering in the other dimension of a matrix is also incomparable 
to imposing the same ordering on both dimensions of the matrix [FHK+03]. Studying the
effectiveness of all these different methods in reducing index symmetry is outside the scope 
of this paper as we only focus on the design of efficient and effective filtering algorithms for 
the multiset ordering constraints. Nevertheless, experimental results in Section [9] show that 
exploiting both multiset ordering and lexicographic ordering constraints can be very effective in 
breaking index symmetry. 

A multiset ordering constraint can also be helpful for implementing other constraints useful 
to break index symmetry. One such constraint is allperm [FJM03]. Experimental results in
[FJM03] show that the decomposition of allperm using a multiset ordering constraint can be as
effective and efficient as the specialised algorithm proposed. 

3.2 Searching for Leximin Optimal Solutions

Another interesting application of the multiset ordering constraint arises in the context of search- 
ing for leximin optimal solutions. Such solutions can be useful in fuzzy CSPs. A fuzzy constraint 
associates a degree of satisfaction to an assignment tuple for the variables it constrains. To com- 
bine degrees of satisfaction, we can use a combination operator like the minimum function. 
Unfortunately, the minimum function may cause a drowning effect when one poorly satisfied 
constraint 'drowns' many highly satisfied constraints. One solution is to collect a vector of de- 
grees of satisfaction, sort these values in ascending order and compare them lexicographically. 
This leximin combination operator identifies the assignment that violates the fewest constraints 
[Far94] . This induces an ordering identical to the multiset ordering except that the lower ele- 
ments of the satisfaction scale are the more significant. It is simple to modify a multiset ordering 
constraint to consider the values in a reverse order. To solve such leximin fuzzy CSPs, we can 
then use branch and bound, adding a multiset ordering constraint when we find a solution to 
ensure that future solutions are greater in the leximin ordering. 
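
The drowning effect and the leximin comparison can be illustrated on two hypothetical vectors of satisfaction degrees. The function names and the sample degrees are assumptions of this example.

```python
def leximin_key(degrees):
    """Sort the satisfaction degrees in ascending order (least satisfied first)."""
    return sorted(degrees)


def leximin_better(a, b):
    """True if a is strictly preferred to b in the leximin ordering."""
    return leximin_key(a) > leximin_key(b)


# Two hypothetical assignments, each evaluated against three fuzzy constraints.
a = [0.9, 0.2, 0.9]
b = [0.3, 0.3, 0.2]

same_under_min = min(a) == min(b)   # the minimum operator cannot separate them
a_preferred = leximin_better(a, b)  # leximin looks past the shared minimum
```

Both assignments have minimum degree 0.2, so the minimum operator rates them equally even though `a` satisfies the other two constraints much better. The leximin comparison breaks the tie at the second-smallest degree (0.9 against 0.3).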

Leximin optimal solutions can be useful also in other domains. For instance, as shown in 
[BL07], they can be exploited as a fairness and Pareto optimality criterion when solving multi-
objective problems in CP. Experimental results in [BL07] show that using a multiset ordering
constraint in a branch and bound search can be competitive with the alternative approaches to 
finding leximin optimal solutions. 

4 A Filtering Algorithm for Multiset Ordering Constraint 

In this section, we present our first filtering algorithm which either detects that $X \leq_m Y$ is 
disentailed or prunes inconsistent values so as to achieve GAC on $X \leq_m Y$. After sketching 
the main features of the algorithm on a running example in Section 4.1, we first present the 
theoretical results that the algorithm exploits in Section 4.2, and then give the details of the 
algorithm in Section 4.3. Throughout, we assume that the variables of the vectors $X$ and $Y$ are 
disjoint. 

4.1 A Worked Example 

The key idea behind the algorithm is to build a pair of occurrence vectors associated with 
$floor(X)$ and $ceiling(Y)$. The algorithm goes through every variable of $X$ and $Y$, checking 
for support for values in the domains. It suffices to have 
$occ(floor(X_{X_i \leftarrow max(X_i)})) \leq_{lex} occ(ceiling(Y))$ to ensure that all values of 
$\mathcal{D}(X_i)$ are consistent. Similarly, we only need 
$occ(floor(X)) \leq_{lex} occ(ceiling(Y_{Y_j \leftarrow min(Y_j)}))$ to hold for the values of 
$\mathcal{D}(Y_j)$ to be consistent. We can avoid the repeated construction and traversal of these 
vectors by building, once and for all, the vectors $occ(floor(X))$ and $occ(ceiling(Y))$, and 
defining some pointers and flags on them. For instance, assume we have 
$occ(floor(X)) \leq_{lex} occ(ceiling(Y))$. The vector $occ(floor(X_{X_i \leftarrow max(X_i)}))$ 
can be obtained from $occ(floor(X))$ by decreasing the number of occurrences of $min(X_i)$ by 1, 
and increasing the number of occurrences of $max(X_i)$ by 1. The pointers and flags tell us 
whether this disturbs the lexicographic ordering, and if so they help us to find quickly the 
largest value of $\mathcal{D}(X_i)$ which does not. 

Consider the multiset ordering constraint $X \leq_m Y$ where: 

X = ({5}, {4,5}, {3,4,5}, {2,4}, {1}, {1}) 
Y = ({4,5}, {4}, {1,2,3,4}, {2,3}, {1}, {0}) 

We have $floor(X) = (5, 4, 3, 2, 1, 1)$ and $ceiling(Y) = (5, 4, 4, 3, 1, 0)$. We construct our 
occurrence vectors $ox = occ(floor(X))$ and $oy = occ(ceiling(Y))$, indexed from 
$max(\{\{ceiling(X)\}\} \cup \{\{ceiling(Y)\}\}) = 5$ to $min(\{\{floor(X)\}\} \cup \{\{floor(Y)\}\}) = 0$: 



       5  4  3  2  1  0 
ox = ( 1, 1, 1, 1, 2, 0 ) 
oy = ( 1, 2, 1, 0, 1, 1 ) 
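
As an illustration, the two occurrence vectors above can be reproduced with a short script. This is a sketch under the assumption that domains are represented as Python sets and occurrence vectors as lists whose first entry corresponds to the largest value `u`; the helper name `occ` simply mirrors the notation of the text.

```python
def occ(values, u, l):
    """Occurrence vector of a ground vector, indexed from u down to l."""
    return [values.count(k) for k in range(u, l - 1, -1)]


X = [{5}, {4, 5}, {3, 4, 5}, {2, 4}, {1}, {1}]
Y = [{4, 5}, {4}, {1, 2, 3, 4}, {2, 3}, {1}, {0}]

floor_x = [min(d) for d in X]      # every X_i at its minimum
ceil_y = [max(d) for d in Y]       # every Y_i at its maximum

# Both vectors share one index range so that they can be compared entry by entry.
u = max(max(d) for d in X + Y)
l = min(min(d) for d in X + Y)

ox = occ(floor_x, u, l)
oy = occ(ceil_y, u, l)
print(ox, oy)
```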

Recall that $ox_i$ and $oy_i$ denote the number of occurrences of the value $i$ in $\{\{floor(X)\}\}$ and 
$\{\{ceiling(Y)\}\}$, respectively. For example, $oy_4 = 2$ as 4 occurs twice in $\{\{ceiling(Y)\}\}$. Next, 
we define our pointers and flags on $ox$ and $oy$. The pointer $\alpha$ points to the most significant 
index above which the values are pairwise equal and at $\alpha$ we have $ox_\alpha < oy_\alpha$. This means that 
we will fail to find support if any of the $X_i$ is assigned a new value greater than $\alpha$, but we will 
always find support for values less than $\alpha$. If $ox = oy$ then we set $\alpha = -\infty$. Otherwise 
(that is, if $ox >_{lex} oy$), we fail immediately because no value for any variable can have support. 
We define $\beta$ as the most significant index below $\alpha$ such that $ox_\beta > oy_\beta$. This means that 
we might fail to find support if any of the $Y_j$ is assigned a new value less than or equal to $\beta$, 
but we will always find support for values larger than $\beta$. If such an index does not exist then 
we set $\beta = -\infty$. Finally, the flag $\gamma$ is true iff $\beta = \alpha - 1$ or 
$ox_{\alpha-1 \rightarrow \beta+1} = oy_{\alpha-1 \rightarrow \beta+1}$, and $\sigma$ is true iff the subvectors below $\beta$ are 
ordered lexicographically the wrong way. In our example, $\alpha = 4$, $\beta = 2$, $\gamma = true$, and $\sigma = true$: 





       5  4  3  2  1  0 
ox = ( 1, 1, 1, 1, 2, 0 ) 
oy = ( 1, 2, 1, 0, 1, 1 ) 

($\alpha$ points to index 4 and $\beta$ to index 2; $\gamma = true$ and $\sigma = true$.) 





We now go through each $X_i$ and find the largest value in its domain which is supported. If 
$X_i$ has a singleton domain then we skip it because we have $ox \leq_{lex} oy$, meaning that its only 
value has support. Consider $X_1$. As $min(X_1) = \alpha$, changing $ox$ to $occ(floor(X_{X_1 \leftarrow max(X_1)}))$ 
increases the number of occurrences of an index above $\alpha$ by 1. This upsets $ox \leq_{lex} oy$. We 
therefore prune all values in $\mathcal{D}(X_1)$ larger than $\alpha$. Now consider $X_2$. We have $max(X_2) > \alpha$ 
and $min(X_2) < \alpha$. As with $X_1$, any value of $X_2$ larger than $\alpha$ upsets the lexicographic ordering, 
but any value less than $\alpha$ guarantees the lexicographic ordering. The question is whether $\alpha$ 
has any support. Changing $ox$ to $occ(floor(X_{X_2 \leftarrow \alpha}))$ decreases the number of occurrences of 
3 in $ox$ by 1, and increases the number of occurrences of $\alpha$ by 1. Now we have $ox_\alpha = oy_\alpha$ but 
decreasing an entry in $ox$ between $\alpha$ and $\beta$ guarantees lexicographic ordering. We therefore 
prune from $\mathcal{D}(X_2)$ only the values greater than $\alpha$. Now consider $X_3$. We have $max(X_3) = \alpha$ 
and $min(X_3) < \alpha$. Any value less than $\alpha$ has support but does $\alpha$ have any support? Changing 
$ox$ to $occ(floor(X_{X_3 \leftarrow \alpha}))$ decreases the number of occurrences of $\beta$ in $ox$ by 1, and increases 
the number of occurrences of $\alpha$ by 1. Now we have $ox_\alpha = oy_\alpha$ and $ox_\beta = oy_\beta$. Since $\gamma$ and $\sigma$ are 
true, the occurrence vectors are lexicographically ordered the wrong way. We therefore prune $\alpha$ 
from $\mathcal{D}(X_3)$. We skip $X_4$ and $X_5$. 

Similarly, we go through each $Y_j$ and find the smallest value in its domain which is supported. 
If $Y_j$ has a singleton domain then we skip it because we have $ox \leq_{lex} oy$, meaning that its only 
value has support. Consider $Y_0$. As $max(Y_0) > \alpha$, changing $oy$ to $occ(ceiling(Y_{Y_0 \leftarrow min(Y_0)}))$ 
decreases the number of occurrences of an index above $\alpha$ by 1. This upsets $ox \leq_{lex} oy$. 
We therefore prune all values in $\mathcal{D}(Y_0)$ less than or equal to $\alpha$. Now consider $Y_2$. We have 
$max(Y_2) = \alpha$ and $min(Y_2) < \beta$. Any value larger than $\beta$ guarantees lexicographic ordering. 
The question is whether the values less than or equal to $\beta$ have any support. Changing $oy$ to 
$occ(ceiling(Y_{Y_2 \leftarrow min(Y_2)}))$ decreases the number of occurrences of $\alpha$ by 1, giving us $ox_\alpha = oy_\alpha$. 
If $min(Y_2) = \beta$ then we have $ox_\beta = oy_\beta$. This disturbs $ox \leq_{lex} oy$ because $\gamma$ and $\sigma$ are both 
true. If $min(Y_2) < \beta$ then again we disturb $ox \leq_{lex} oy$ because $\gamma$ is true and the vectors are not 
lexicographically ordered as of $\beta$. So, we prune from $\mathcal{D}(Y_2)$ the values less than or equal to $\beta$. 
Now consider $Y_3$. As $max(Y_3) < \alpha$, changing $oy$ to $occ(ceiling(Y_{Y_3 \leftarrow min(Y_3)}))$ does not change 
that $ox \leq_{lex} oy$. Hence, $min(Y_3)$ is supported. We skip $Y_4$ and $Y_5$. 

We now have the following generalised arc-consistent vectors: 

X = ({5}, {4}, {3,4}, {2}, {1}, {1}) 
Y = ({5}, {4}, {3,4}, {2,3}, {1}, {0}) 
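
The outcome of the worked example can be cross-checked by exhaustive enumeration. The sketch below is not the filtering algorithm of this section: it assumes set-valued domains and simply keeps every value that appears in some pair of ground vectors satisfying the constraint, which is the definition of GAC. The names `mset_leq` and `brute_force_gac` are chosen for the illustration.

```python
from itertools import product


def mset_leq(x, y):
    """Multiset ordering on equal-length ground vectors.

    Sorting in descending order and comparing lexicographically compares
    the maxima first, then the next largest elements, and so on.
    """
    return sorted(x, reverse=True) <= sorted(y, reverse=True)


def brute_force_gac(X, Y):
    """Keep exactly the values that occur in at least one satisfying pair."""
    keep_x = [set() for _ in X]
    keep_y = [set() for _ in Y]
    for x in product(*X):
        for y in product(*Y):
            if mset_leq(x, y):
                for i, v in enumerate(x):
                    keep_x[i].add(v)
                for j, v in enumerate(y):
                    keep_y[j].add(v)
    return keep_x, keep_y


X = [{5}, {4, 5}, {3, 4, 5}, {2, 4}, {1}, {1}]
Y = [{4, 5}, {4}, {1, 2, 3, 4}, {2, 3}, {1}, {0}]
gac_x, gac_y = brute_force_gac(X, Y)
print(gac_x)
print(gac_y)
```

The enumeration visits every combination of values, so it is exponential in the vector length and only useful as a reference on small instances such as this one.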

4.2 Theoretical Background 

The algorithm exploits four theoretical results. The first reduces GAC to consistency on the 
upper bounds of $X$ and on the lower bounds of $Y$. The second and the third show in turn 
when $X \leq_m Y$ is disentailed and what conditions ensure GAC on $X \leq_m Y$. The fourth 
establishes that two ground vectors are multiset ordered iff the associated occurrence vectors 
are lexicographically ordered. 

Theorem 1 $GAC(X \leq_m Y)$ iff for all $0 \leq i < n$, $max(X_i)$ and $min(Y_i)$ are consistent. 

Proof: GAC implies that every value is consistent. To show the reverse, suppose for all 
$0 \leq i < n$, $max(X_i)$ and $min(Y_i)$ are supported, but the constraint is not GAC. Then there is 
an inconsistent value. If this value is in some $\mathcal{D}(X_i)$ then any value greater than this value, in 
particular $max(X_i)$, is inconsistent. Similarly, if the inconsistent value is in some $\mathcal{D}(Y_i)$ then 
any value less than this value, in particular $min(Y_i)$, is inconsistent. In any case, the bounds 
are not consistent. QED. 

A constraint is said to be disentailed when the constraint is false. The next two theorems 
show when $X \leq_m Y$ is disentailed and what conditions ensure GAC on $X \leq_m Y$. 

Theorem 2 $X \leq_m Y$ is disentailed iff $\{\{floor(X)\}\} >_m \{\{ceiling(Y)\}\}$. 

Proof: ($\Rightarrow$) Since $X \leq_m Y$ is disentailed, any combination of assignments, including 
$X \leftarrow floor(X)$ and $Y \leftarrow ceiling(Y)$, does not satisfy $X \leq_m Y$. Hence, 
$\{\{floor(X)\}\} >_m \{\{ceiling(Y)\}\}$. 

($\Leftarrow$) Any $x \in X$ is greater than any $y \in Y$ under the multiset ordering. Hence, $X \leq_m Y$ is 
disentailed. QED. 
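
Theorem 2 gives a direct disentailment test. The following sketch assumes set-valued domains and compares the two extreme ground vectors by sorting them in descending order; the function name `disentailed` and the small second instance are made up for the illustration.

```python
def disentailed(X, Y):
    """True iff {{floor(X)}} >_m {{ceiling(Y)}} (Theorem 2)."""
    floor_x = sorted((min(d) for d in X), reverse=True)
    ceil_y = sorted((max(d) for d in Y), reverse=True)
    # Lexicographic comparison of descending-sorted vectors of equal
    # length coincides with the multiset ordering.
    return floor_x > ceil_y


# The worked example of Section 4.1 is not disentailed.
X = [{5}, {4, 5}, {3, 4, 5}, {2, 4}, {1}, {1}]
Y = [{4, 5}, {4}, {1, 2, 3, 4}, {2, 3}, {1}, {0}]
print(disentailed(X, Y))

# Here even the smallest X, (3, 2), exceeds the largest Y, (2, 2).
print(disentailed([{3}, {2, 3}], [{1, 2}, {2}]))
```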

Theorem 3 $GAC(X \leq_m Y)$ iff for all $i$ in $[0, n)$: 

$\{\{floor(X_{X_i \leftarrow max(X_i)})\}\} \leq_m \{\{ceiling(Y)\}\}$ (1) 

$\{\{floor(X)\}\} \leq_m \{\{ceiling(Y_{Y_i \leftarrow min(Y_i)})\}\}$ (2) 

Proof: ($\Rightarrow$) As the constraint is GAC, all values have support. In particular, $X_i \leftarrow max(X_i)$ 
has a support $x_1 \in \{x \mid x_i = max(X_i) \wedge x \in X\}$ and $y_1 \in Y$ where $\{\{x_1\}\} \leq_m \{\{y_1\}\}$. Any 
$x_2 \in \{x \mid x_i = max(X_i) \wedge x \in X\}$ less than or equal to $x_1$, and any $y_2 \in Y$ greater than or equal 
to $y_1$, under multiset ordering, support $X_i \leftarrow max(X_i)$. In particular, 
$min\{x \mid x_i = max(X_i) \wedge x \in X\}$ and $max\{y \mid y \in Y\}$ support $X_i \leftarrow max(X_i)$. We get 
$min\{x \mid x_i = max(X_i) \wedge x \in X\}$ if all the other variables in $X$ take their minimums, and we get 
$max\{y \mid y \in Y\}$ if all the variables in $Y$ take their maximums. Hence, 
$\{\{floor(X_{X_i \leftarrow max(X_i)})\}\} \leq_m \{\{ceiling(Y)\}\}$. 

A dual argument holds for the variables of $Y$. As the constraint is GAC, $Y_i \leftarrow min(Y_i)$ 
has a support $x_1 \in X$ and $y_1 \in \{y \mid y_i = min(Y_i) \wedge y \in Y\}$ where $\{\{x_1\}\} \leq_m \{\{y_1\}\}$. Any 
$x_2 \in X$ less than or equal to $x_1$, and any $y_2 \in \{y \mid y_i = min(Y_i) \wedge y \in Y\}$ greater than or 
equal to $y_1$, in particular $min\{x \mid x \in X\}$ and $max\{y \mid y_i = min(Y_i) \wedge y \in Y\}$, support 
$Y_i \leftarrow min(Y_i)$. We get $min\{x \mid x \in X\}$ if all the variables in $X$ take their minimums, and 
we get $max\{y \mid y_i = min(Y_i) \wedge y \in Y\}$ if all the other variables in $Y$ take their maximums. 
Hence, $\{\{floor(X)\}\} \leq_m \{\{ceiling(Y_{Y_i \leftarrow min(Y_i)})\}\}$. 

($\Leftarrow$) Equation (1) ensures that for all $0 \leq i < n$, $max(X_i)$ is supported, and Equation (2) 
ensures that for all $0 \leq i < n$, $min(Y_i)$ is supported. By Theorem 1, the constraint is GAC. 
QED. 

In Theorems 2 and 3, we need to check whether two ground vectors are multiset ordered. 
The following theorem shows that we can do this by lexicographically comparing the occurrence 
vectors associated with these vectors. 

Theorem 4 $\{\{x\}\} \leq_m \{\{y\}\}$ iff $occ(x) \leq_{lex} occ(y)$. 

Proof: ($\Rightarrow$) Suppose $\{\{x\}\} = \{\{y\}\}$. Then the occurrence vectors associated with $x$ and $y$ 
are the same. Suppose $\{\{x\}\} <_m \{\{y\}\}$. If $max\{\{x\}\} < max\{\{y\}\}$ then the leftmost index of 
$ox = occ(x)$ and $oy = occ(y)$ is $max\{\{y\}\}$, and we have $ox_{max\{\{y\}\}} = 0$ and $oy_{max\{\{y\}\}} > 0$. This 
gives $ox <_{lex} oy$. If $max\{\{x\}\} = max\{\{y\}\} = a$ then we eliminate one occurrence of $a$ from each 
multiset and compare the resulting multisets. 

($\Leftarrow$) Suppose $occ(x) = occ(y)$. Then $\{\{x\}\}$ and $\{\{y\}\}$ contain the same elements with equal 
occurrences. Suppose $occ(x) <_{lex} occ(y)$. Then a value $a$ occurs more in $\{\{y\}\}$ than in $\{\{x\}\}$, and 
the occurrence of any value $b > a$ is the same in both multisets. By deleting all the occurrences 
of $a$ from $\{\{x\}\}$ and the same number of occurrences of $a$ from $\{\{y\}\}$, as well as any $b > a$ from 
both multisets, we get $max\{\{x\}\} < max\{\{y\}\}$. QED. 
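
Theorem 4 can be sanity-checked empirically. The sketch below compares the two characterisations on randomly generated vectors of equal length; the helper names, the value range 0..4 and the number of trials are arbitrary choices for the illustration, and a passing run is evidence rather than a proof.

```python
import random
from collections import Counter


def occ(vec, u, l):
    """Occurrence vector of vec, indexed from u (leftmost) down to l."""
    counts = Counter(vec)
    return [counts[k] for k in range(u, l - 1, -1)]


def mset_leq(x, y):
    """Multiset ordering via descending sort and lexicographic comparison."""
    return sorted(x, reverse=True) <= sorted(y, reverse=True)


def agree_on_random_vectors(trials=2000, seed=0):
    """Return True if both tests agree on every sampled pair of vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(1, 5)
        x = [rng.randint(0, 4) for _ in range(n)]
        y = [rng.randint(0, 4) for _ in range(n)]
        u, l = max(x + y), min(x + y)   # common index range for both vectors
        if mset_leq(x, y) != (occ(x, u, l) <= occ(y, u, l)):
            return False
    return True


print(agree_on_random_vectors())
```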

Theorems 2 and 3 together with Theorem 4 yield the following propositions: 

Proposition 1 $X \leq_m Y$ is disentailed iff $occ(floor(X)) >_{lex} occ(ceiling(Y))$. 

Proposition 2 $GAC(X \leq_m Y)$ iff for all $i$ in $[0, n)$: 

$occ(floor(X_{X_i \leftarrow max(X_i)})) \leq_{lex} occ(ceiling(Y))$ (3) 
$occ(floor(X)) \leq_{lex} occ(ceiling(Y_{Y_i \leftarrow min(Y_i)}))$ (4) 

A naive way to enforce GAC on $X \leq_m Y$ is going through every variable in the vectors, 
constructing the appropriate occurrence vectors, and checking if their bounds satisfy (3) and 
(4). If they do, then the bound is consistent. Otherwise, we try the nearest bound until we 
obtain a consistent bound. We can, however, do better than this by building only the vectors 
$occ(floor(X))$ and $occ(ceiling(Y))$, and then defining some pointers and Boolean flags on 
them. This saves us from the repeated construction and traversal of the appropriate occurrence 
vectors. Another advantage is that we can find consistent bounds without having to explore the 
values in the domains. 
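
For comparison with the pointer-based algorithm, the naive approach can be sketched as follows. The code assumes set-valued domains and a constraint that is not disentailed (so that $min(X_i)$ and $max(Y_i)$ always qualify); `naive_bounds` is a name invented for the illustration. It rebuilds an occurrence vector for every candidate bound, which is exactly the repeated work the pointers and flags avoid.

```python
def occ(values, u, l):
    """Occurrence vector of a ground vector, indexed from u down to l."""
    return [values.count(k) for k in range(u, l - 1, -1)]


def naive_bounds(X, Y):
    """Tight upper bounds for X and tight lower bounds for Y via (3) and (4)."""
    floor_x = [min(d) for d in X]
    ceil_y = [max(d) for d in Y]
    u = max(max(d) for d in X + Y)
    l = min(min(d) for d in X + Y)
    ox, oy = occ(floor_x, u, l), occ(ceil_y, u, l)

    upper = []
    for i, dom in enumerate(X):
        # Try candidates from the largest downwards until condition (3) holds.
        upper.append(next(
            v for v in sorted(dom, reverse=True)
            if occ(floor_x[:i] + [v] + floor_x[i + 1:], u, l) <= oy))

    lower = []
    for i, dom in enumerate(Y):
        # Try candidates from the smallest upwards until condition (4) holds.
        lower.append(next(
            v for v in sorted(dom)
            if ox <= occ(ceil_y[:i] + [v] + ceil_y[i + 1:], u, l)))
    return upper, lower


X = [{5}, {4, 5}, {3, 4, 5}, {2, 4}, {1}, {1}]
Y = [{4, 5}, {4}, {1, 2, 3, 4}, {2, 3}, {1}, {0}]
upper, lower = naive_bounds(X, Y)
print(upper, lower)
```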

We start by defining our pointers and flags. We write $ox$ for $occ(floor(X))$, and $oy$ for 
$occ(ceiling(Y))$. We assume $ox$ and $oy$ are indexed from $u$ to $l$, and $ox \leq_{lex} oy$. [1] 

Definition 5 Given $ox = occ(floor(X))$ and $oy = occ(ceiling(Y))$ indexed as $u..l$ where 
$ox \leq_{lex} oy$, the pointer $\alpha$ is set either to the index in $[u, l]$ such that: 

$ox_\alpha < oy_\alpha \wedge$ 

$\forall i \; u \geq i > \alpha \; . \; ox_i = oy_i$ 

or (if this is not the case) to $-\infty$. 

Informally, $\alpha$ points to the most significant index in $[u, l]$ such that $ox_\alpha < oy_\alpha$ and all the 
entries above it are pairwise equal. If, however, $ox = oy$ then $\alpha$ points to $-\infty$. 

Definition 6 Given $ox = occ(floor(X))$ and $oy = occ(ceiling(Y))$ indexed as $u..l$ where 
$ox \leq_{lex} oy$, the pointer $\beta$ is set either to the index in $(\alpha, l]$ such that: 

$ox_\beta > oy_\beta \wedge$ 

$\forall i \; \alpha > i > \beta \; . \; ox_i \leq oy_i$ 

or (if $\alpha \leq l$ or for all $\alpha > i \geq l$ we have $ox_i \leq oy_i$) to $-\infty$. 

[1] In the context of occurrence vector indexing, $u..l$ and $[u, l]$ imply $u \geq l$. The exact meaning of these 
abused notations will be clear from the context. 

Informally, $\beta$ points to the most significant index in $(\alpha, l]$ such that 
$ox_{\beta \rightarrow l} >_{lex} oy_{\beta \rightarrow l}$. If such an index does not exist, then $\beta$ points to $-\infty$. Note that 
we have $\sum_i ox_i = \sum_i oy_i = n$, as $ox$ and $oy$ are both associated with vectors of length $n$. 
Hence, $\alpha$ cannot be $l$, and we always have $ox_{\alpha-1 \rightarrow l} >_{lex} oy_{\alpha-1 \rightarrow l}$ when $\alpha \neq -\infty$. 

Definition 7 Given $ox = occ(floor(X))$ and $oy = occ(ceiling(Y))$ indexed as $u..l$ where 
$ox \leq_{lex} oy$, the flag $\gamma$ is true iff: 

$\beta \neq -\infty \wedge (\beta = \alpha - 1 \vee \forall i \; \alpha > i > \beta \; . \; ox_i = oy_i)$ 

Informally, $\gamma$ is true if $\beta \neq -\infty$, and either $\beta$ is just next to $\alpha$ or the subvectors between $\alpha$ and 
$\beta$ are equal. Otherwise, $\gamma$ is false. 

Definition 8 Given $ox = occ(floor(X))$ and $oy = occ(ceiling(Y))$ indexed as $u..l$ where 
$ox \leq_{lex} oy$, the flag $\sigma$ is true iff: 

$\beta > l \wedge ox_{\beta-1 \rightarrow l} >_{lex} oy_{\beta-1 \rightarrow l}$ 

Informally, $\sigma$ is true if $\beta > l$ and the subvectors below $\beta$ are lexicographically ordered the wrong 
way. If, however, $\beta \leq l$, or the subvectors below $\beta$ are lexicographically ordered, then $\sigma$ is false. 

Using $\alpha$, $\beta$, $\gamma$, and $\sigma$, we can find the tight upper bound for each $\mathcal{D}(X_i)$ as well as the 
tight lower bound for each $\mathcal{D}(Y_i)$ without having to traverse the occurrence vectors. In the next 
three theorems, we are concerned with $X_i$. When looking for a support for a value $v \in \mathcal{D}(X_i)$, 
we obtain $occ(floor(X_{X_i \leftarrow v}))$ by increasing $ox_v$ by 1, and decreasing $ox_{min(X_i)}$ by 1. Since 
$ox \leq_{lex} oy$, $min(X_i)$ is consistent. We therefore seek support for values greater than $min(X_i)$. 

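Theorem 5 Given $ox = occ(floor(X))$ and $oy = occ(ceiling(Y))$ indexed as $u..l$ where 
$ox \leq_{lex} oy$, if $max(X_i) \geq \alpha$ and $min(X_i) < \alpha$ then for all $v \in \mathcal{D}(X_i)$: 

1. if $v > \alpha$ then $v$ is inconsistent; 

2. if $v < \alpha$ then $v$ is consistent; 

3. if $v = \alpha$ then $v$ is inconsistent iff: 

$(ox_\alpha + 1 = oy_\alpha \wedge min(X_i) = \beta \wedge \gamma \wedge ox_\beta > oy_\beta + 1) \; \vee$ 
$(ox_\alpha + 1 = oy_\alpha \wedge min(X_i) = \beta \wedge \gamma \wedge ox_\beta = oy_\beta + 1 \wedge \sigma) \; \vee$ 
$(ox_\alpha + 1 = oy_\alpha \wedge min(X_i) < \beta \wedge \gamma)$ 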

Proof: If $min(X_i) < \alpha$ then $\alpha \neq -\infty$ and $ox <_{lex} oy$. Let $v$ be a value in $\mathcal{D}(X_i)$ greater 
than $\alpha$. Increasing $ox_v$ by 1 gives $ox >_{lex} oy$. By Proposition 2, $v$ is inconsistent. Now let 
$v$ be less than $\alpha$. Increasing $ox_v$ by 1 does not change $ox <_{lex} oy$. By Proposition 2, $v$ is 
consistent. Is $\alpha$ a tight upper bound? If any of the conditions in item 3 is true then we obtain 
$ox >_{lex} oy$ by increasing $ox_\alpha$ by 1 and decreasing $ox_{min(X_i)}$ by 1. By Proposition 2, $v = \alpha$ is 
inconsistent and therefore the largest value which is less than $\alpha$ is the tight upper bound. We 
now need to show that the conditions of item 3 are exhaustive. If $v = \alpha$ is inconsistent then, 
by Proposition 2, we obtain $ox >_{lex} oy$ after increasing $ox_\alpha$ by 1 and decreasing $ox_{min(X_i)}$ by 
1. This can happen only if $ox_\alpha + 1 = oy_\alpha$ because otherwise we still have $ox_\alpha < oy_\alpha$. Now, it 
is important where we decrease an occurrence. If it is above $\beta$ (but below $\alpha$ as $min(X_i) < \alpha$) 
then we still have $ox <_{lex} oy$ because for all $\alpha > i > max\{l - 1, \beta\}$, we have $ox_i \leq oy_i$. If 
it is on or below $\beta$ (when $\beta \neq -\infty$) and $\gamma$ is false, then we still have $ox <_{lex} oy$ because $\gamma$ 
is false when $\beta < \alpha - 1$ and $ox_{\alpha-1 \rightarrow \beta+1} <_{lex} oy_{\alpha-1 \rightarrow \beta+1}$. Therefore, it is necessary to have 
$ox_\alpha + 1 = oy_\alpha \wedge min(X_i) \leq \beta \wedge \gamma$ for $\alpha$ to be inconsistent. Two cases arise here. In 
the first, we have $ox_\alpha + 1 = oy_\alpha \wedge min(X_i) = \beta \wedge \gamma$. Decreasing $ox_\beta$ by 1 can give 
$ox >_{lex} oy$ in two ways: either we still have $ox_\beta > oy_\beta$, or we now have $ox_\beta = oy_\beta$ but 
the vectors below $\beta$ are ordered lexicographically the wrong way. Note that decreasing $ox_\beta$ 
by 1 cannot give $ox_\beta < oy_\beta$. Therefore, the first case results in two conditions for $\alpha$ to be 
inconsistent: $ox_\alpha + 1 = oy_\alpha \wedge min(X_i) = \beta \wedge \gamma \wedge ox_\beta > oy_\beta + 1$ or 
$ox_\alpha + 1 = oy_\alpha \wedge min(X_i) = \beta \wedge \gamma \wedge ox_\beta = oy_\beta + 1 \wedge \sigma$. Now consider the second case, 
where we have $ox_\alpha + 1 = oy_\alpha \wedge min(X_i) < \beta \wedge \gamma$. Decreasing $ox_{min(X_i)}$ by 1 gives 
$ox >_{lex} oy$. Hence, if $v = \alpha$ is inconsistent then we have one of the three conditions. QED. 

Theorem 6 Given $ox = occ(floor(X))$ and $oy = occ(ceiling(Y))$ indexed as $u..l$ where 
$ox \leq_{lex} oy$, if $max(X_i) < \alpha$ then $max(X_i)$ is the tight upper bound. 

Proof: If $max(X_i) < \alpha$ then we have $\alpha \neq -\infty$ and $ox <_{lex} oy$. Increasing $ox_{max(X_i)}$ by 1 does 
not change this. By Proposition 2, $max(X_i)$ is consistent. QED. 

Theorem 7 Given $ox = occ(floor(X))$ and $oy = occ(ceiling(Y))$ indexed as $u..l$ where 
$ox \leq_{lex} oy$, if $min(X_i) \geq \alpha$ then $min(X_i)$ is the tight upper bound. 

Proof: Any $v > min(X_i)$ in $\mathcal{D}(X_i)$ is greater than $\alpha$. Increasing $ox_v$ by 1 gives $ox >_{lex} oy$. By 
Proposition 2, any $v > min(X_i)$ in $\mathcal{D}(X_i)$ is inconsistent. QED. 

In the next four theorems, we are concerned with $Y_i$. When looking for a support for a value 
$v \in \mathcal{D}(Y_i)$, we obtain $occ(ceiling(Y_{Y_i \leftarrow v}))$ by increasing $oy_v$ by 1, and decreasing $oy_{max(Y_i)}$ by 1. 
Since $ox \leq_{lex} oy$, $max(Y_i)$ is consistent. We therefore seek support for values less than $max(Y_i)$. 

Theorem 8 Given $ox = occ(floor(X))$ and $oy = occ(ceiling(Y))$ indexed as $u..l$ where 
$ox \leq_{lex} oy$, if $max(Y_i) = \alpha$ and $min(Y_i) \leq \beta$ then for all $v \in \mathcal{D}(Y_i)$: 

1. if $v > \beta$ then $v$ is consistent; 

2. if $v < \beta$ then $v$ is inconsistent iff $ox_\alpha + 1 = oy_\alpha \wedge \gamma$; 

3. if $v = \beta$ then $v$ is inconsistent iff: 

$(ox_\alpha + 1 = oy_\alpha \wedge \gamma \wedge ox_\beta > oy_\beta + 1) \; \vee$ 
$(ox_\alpha + 1 = oy_\alpha \wedge \gamma \wedge ox_\beta = oy_\beta + 1 \wedge \sigma)$ 

Proof: If $max(Y_i) = \alpha$ and $min(Y_i) \leq \beta$ then $\alpha \neq -\infty$, $\beta \neq -\infty$, and $ox <_{lex} oy$. Let 
$v$ be a value in $\mathcal{D}(Y_i)$ greater than $\beta$. Increasing $oy_v$ by 1 and decreasing $oy_\alpha$ by 1 does not 
change $ox <_{lex} oy$. This is because for all $\alpha > i > \beta$, we have $ox_i \leq oy_i$. Even if now 
$ox_{\alpha \rightarrow v+1} = oy_{\alpha \rightarrow v+1}$, at $v$ we have $ox_v < oy_v$. By Proposition 2, $v$ is consistent. Now let $v$ be 
less than $\beta$. If the condition in item 2 is true then we obtain $ox >_{lex} oy$ by decreasing $oy_\alpha$ by 
1 and increasing $oy_v$ by 1. By Proposition 2, $v$ is inconsistent. We now need to show that this 
condition is exhaustive. If $v$ is inconsistent then by Proposition 2, we obtain $ox >_{lex} oy$ after 
decreasing $oy_\alpha$ by 1 and increasing $oy_v$ by 1. This is in fact the same as obtaining $ox >_{lex} oy$ 
after increasing $ox_\alpha$ by 1 and decreasing $ox_v$ by 1. We have already captured this case in the 
last condition of item 3 in Theorem 5. Hence, it is necessary to have $ox_\alpha + 1 = oy_\alpha \wedge \gamma$ for 
$v$ to be inconsistent. What about $\beta$ then? If any of the conditions in item 3 is true then we 
obtain $ox >_{lex} oy$ by decreasing $oy_\alpha$ by 1 and increasing $oy_\beta$ by 1. By Proposition 2, $v = \beta$ is 
inconsistent. In this case, the values less than $\beta$ are also inconsistent. Therefore, the smallest 
value which is greater than $\beta$ is the tight lower bound. We now need to show that the conditions 
of item 3 are exhaustive. If $v = \beta$ is inconsistent then by Proposition 2, we obtain $ox >_{lex} oy$ 
after decreasing $oy_\alpha$ by 1 and increasing $oy_\beta$ by 1. This is the same as obtaining $ox >_{lex} oy$ 
after increasing $ox_\alpha$ by 1 and decreasing $ox_\beta$ by 1. We have captured this case in the first 
two conditions of item 3 in Theorem 5. Hence, if $v = \beta$ is inconsistent then we have either 
$ox_\alpha + 1 = oy_\alpha \wedge \gamma \wedge ox_\beta > oy_\beta + 1$ or $ox_\alpha + 1 = oy_\alpha \wedge \gamma \wedge ox_\beta = oy_\beta + 1 \wedge \sigma$. QED. 






Algorithm 1: Initialise 

Data   : $\langle X_0, X_1, \ldots, X_{n-1} \rangle$, $\langle Y_0, Y_1, \ldots, Y_{n-1} \rangle$ 
Result : $occ(floor(X))$ and $occ(ceiling(Y))$ are initialised, $GAC(X \leq_m Y)$ 

1 $l := min(\{\{floor(X)\}\} \cup \{\{floor(Y)\}\})$; 
2 $u := max(\{\{ceiling(X)\}\} \cup \{\{ceiling(Y)\}\})$; 
3 $ox := occ(floor(X))$; 
4 $oy := occ(ceiling(Y))$; 
5 MsetLeq; 



Theorem 9 Given $ox = occ(floor(X))$ and $oy = occ(ceiling(Y))$ indexed as $u..l$ where 
$ox \leq_{lex} oy$, if $max(Y_i) = \alpha$ and $min(Y_i) > \beta$ then $min(Y_i)$ is the tight lower bound. 

Proof: If $max(Y_i) = \alpha$ then $\alpha \neq -\infty$ and $ox <_{lex} oy$. Increasing $oy_{min(Y_i)}$ by 1 and decreasing 
$oy_\alpha$ by 1 does not change $ox <_{lex} oy$. This is because for all $\alpha > i > max\{l - 1, \beta\}$, we have 
$ox_i \leq oy_i$. Even if now $ox_{\alpha \rightarrow min(Y_i)+1} = oy_{\alpha \rightarrow min(Y_i)+1}$, at $min(Y_i)$ we have 
$ox_{min(Y_i)} < oy_{min(Y_i)}$. By Proposition 2, $min(Y_i)$ is consistent. QED. 

Theorem 10 Given $ox = occ(floor(X))$ and $oy = occ(ceiling(Y))$ indexed as $u..l$ where 
$ox \leq_{lex} oy$, if $max(Y_i) < \alpha$ then $min(Y_i)$ is the tight lower bound. 

Proof: If $max(Y_i) < \alpha$ then we have $\alpha \neq -\infty$ and $ox <_{lex} oy$. Decreasing $oy_{max(Y_i)}$ by 1 does 
not change this. By Proposition 2, $min(Y_i)$ is consistent. QED. 

Theorem 11 Given $ox = occ(floor(X))$ and $oy = occ(ceiling(Y))$ indexed as $u..l$ where 
$ox \leq_{lex} oy$, if $max(Y_i) > \alpha$ then $max(Y_i)$ is the tight lower bound. 

Proof: Decreasing $oy_{max(Y_i)}$ by 1 gives $ox >_{lex} oy$. By Proposition 2, any $v < max(Y_i)$ in 
$\mathcal{D}(Y_i)$ is inconsistent. QED. 

4.3 Algorithm Details and Theoretical Properties 

In this subsection, we first explain MsetLeq as well as prove that it is correct and complete. We 
then discuss its time complexity. 

The algorithm is based on Theorems 5-11. The pointers and flags are recomputed every 
time the algorithm is called, as there is no obvious way to maintain them incrementally. 
Fortunately, incremental maintenance of the occurrence vectors is trivial. When the minimum 
value in some $\mathcal{D}(X_i)$ changes, we update $ox$ by incrementing the entry corresponding to the new 
$min(X_i)$ by 1, and decrementing the entry corresponding to the old $min(X_i)$ by 1. Similarly, 
when the maximum value in some $\mathcal{D}(Y_i)$ changes, we update $oy$ by incrementing the entry 
corresponding to the new $max(Y_i)$ by 1, and decrementing the entry corresponding to the old $max(Y_i)$ 
by 1. 
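
The incremental update amounts to moving one occurrence between two entries. A minimal sketch, assuming the occurrence vector is stored as a Python list whose position 0 corresponds to the value `u`; the helper name `move_occurrence` is hypothetical.

```python
def move_occurrence(vec, u, old, new):
    """Move one occurrence from value `old` to value `new`.

    `vec` is indexed from u downwards, so value v lives at position u - v.
    The same helper serves ox (when some min(X_i) changes) and
    oy (when some max(Y_i) changes).
    """
    vec[u - old] -= 1
    vec[u - new] += 1


# occ(floor(X)) from the worked example, indexed from 5 down to 0.
ox = [1, 1, 1, 1, 2, 0]

# Suppose the value 3 is removed from D(X_2) = {3, 4, 5}: min(X_2) rises to 4.
move_occurrence(ox, 5, old=3, new=4)
print(ox)
```

Each update touches two entries only, so its cost does not depend on the length of the occurrence vector.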

When the constraint is first posted, we need to initialise the occurrence vectors, and call 
the filtering algorithm MsetLeq to establish the generalised arc-consistent state with the initial 
values of the occurrence vectors. In Algorithm 1, we show the steps of this initialisation. 

Theorem 12 Initialise initialises $ox$ and $oy$ correctly. Then it either establishes failure if 
$X \leq_m Y$ is disentailed, or prunes all inconsistent values from $X$ and $Y$ to ensure 
$GAC(X \leq_m Y)$. 

Proof: Initialise first computes the most and the least significant indices of the occurrence 
vectors as $u$ and $l$ (lines 1 and 2). An occurrence vector $occ(x)$ associated with $x$ is indexed in 
decreasing order of significance from $max\{\{x\}\}$ to $min\{\{x\}\}$. Our occurrence vectors are associated 
with $floor(X)$ and $ceiling(Y)$ but they are also used for checking support for $max(X_i)$ and 

Algorithm 2: MsetLeq 

Data   : $\langle X_0, X_1, \ldots, X_{n-1} \rangle$, $\langle Y_0, Y_1, \ldots, Y_{n-1} \rangle$ 
Result : $GAC(X \leq_m Y)$ 

A1  SetPointersAndFlags; 
B1  foreach $i \in [0, n)$ do 
B2      if $min(X_i) \neq max(X_i)$ then 
B3          if $min(X_i) \geq \alpha$ then setMax($X_i$, $min(X_i)$); 
B4          if $max(X_i) \geq \alpha \wedge min(X_i) < \alpha$ then 
B5              setMax($X_i$, $\alpha$); 
B6              if $ox_\alpha + 1 = oy_\alpha \wedge min(X_i) = \beta \wedge \gamma$ then 
B7                  if $ox_\beta = oy_\beta + 1$ then 
B8                      if $\sigma$ then setMax($X_i$, $\alpha - 1$); 
                    else 
B9                      setMax($X_i$, $\alpha - 1$); 
                    end 
                end 
B10             if $ox_\alpha + 1 = oy_\alpha \wedge min(X_i) < \beta \wedge \gamma$ then 
B11                 setMax($X_i$, $\alpha - 1$); 
                end 
            end 
        end 
    end 
C1  foreach $i \in [0, n)$ do 
C2      if $min(Y_i) \neq max(Y_i)$ then 
C3          if $max(Y_i) > \alpha$ then setMin($Y_i$, $max(Y_i)$); 
C4          if $max(Y_i) = \alpha \wedge min(Y_i) \leq \beta$ then 
C5              if $ox_\alpha + 1 = oy_\alpha \wedge \gamma$ then 
C6                  setMin($Y_i$, $\beta$); 
C7                  if $ox_\beta = oy_\beta + 1$ then 
C8                      if $\sigma$ then setMin($Y_i$, $\beta + 1$); 
                    else 
C9                      setMin($Y_i$, $\beta + 1$); 
                    end 
                end 
            end 
        end 
    end 

$min(Y_i)$ for all $0 \leq i < n$. We therefore need to make sure that there are corresponding 
entries. Also, to be able to compare two occurrence vectors, they need to start and end with 
the occurrence of the same value. Therefore, $u$ is $max(\{\{ceiling(X)\}\} \cup \{\{ceiling(Y)\}\})$ and $l$ 
is $min(\{\{floor(X)\}\} \cup \{\{floor(Y)\}\})$. 

Using these indices, a pair of vectors $ox$ and $oy$ of length $u - l + 1$ are constructed and each 
entry in these vectors is set to 0. Then, $ox_{min(X_i)}$ and $oy_{max(Y_i)}$ are incremented by 1 for all 
$0 \leq i < n$. Now, for all $u \geq v \geq l$, $ox_v$ is the number of occurrences of $v$ in $\{\{floor(X)\}\}$. 
Similarly, for all $u \geq v \geq l$, $oy_v$ is the number of occurrences of $v$ in $\{\{ceiling(Y)\}\}$. This 
gives us $ox = occ(floor(X))$ and $oy = occ(ceiling(Y))$ (lines 3 and 4). Finally, in line 5, 
Initialise calls the filtering algorithm MsetLeq which either establishes failure if $X \leq_m Y$ is 
disentailed, or prunes all inconsistent values from $X$ and $Y$ to ensure $GAC(X \leq_m Y)$. QED. 

Note that when $X \leq_m Y$ is GAC, every value in $\mathcal{D}(X_i)$ is supported by 
$\langle min(X_0), \ldots, min(X_{i-1}), min(X_{i+1}), \ldots, min(X_{n-1}) \rangle$ and $\langle max(Y_0), \ldots, max(Y_{n-1}) \rangle$. 
Similarly, every value in $\mathcal{D}(Y_i)$ is supported by $\langle min(X_0), \ldots, min(X_{n-1}) \rangle$ and 
$\langle max(Y_0), \ldots, max(Y_{i-1}), max(Y_{i+1}), \ldots, max(Y_{n-1}) \rangle$. 
So, MsetLeq is also called by the event handler whenever $min(X_i)$ or $max(Y_i)$ of some $i$ in $[0, n)$ 
changes. 

In Algorithm 2, we show the steps of MsetLeq. Since $ox$ and $oy$ are maintained incrementally, 
the algorithm first sets the pointers and flags in line A1 via SetPointersAndFlags using the 

Procedure SetPointersAndFlags 

1   $i := u$; 
2   while $i \geq l \wedge ox_i = oy_i$ do $i := i - 1$; 
3   if $i \geq l \wedge ox_i > oy_i$ then fail; 
4   else if $i = l - 1$ then $\alpha := -\infty$; 
5   else $\alpha := i$; 
6   if $\alpha \leq l$ then $\beta := -\infty$; 
7   else if $\alpha > l$ then 
8       $j := \alpha - 1$, $temp := true$; 
9       while $j \geq l \wedge ox_j \leq oy_j$ do 
10          if $ox_j < oy_j$ then $temp := false$; 
11          $j := j - 1$; 
        end 
12      if $j = l - 1$ then $\beta := -\infty$; 
13      else $\beta := j$; 
    end 
14  $\gamma := false$, $\sigma := false$; 
15  if $\beta \neq -\infty \wedge temp$ then $\gamma := true$; 
16  if $\beta > l$ then 
17      $k := \beta - 1$; 
18      while $k \geq l \wedge ox_k = oy_k$ do $k := k - 1$; 
19      if $k \geq l \wedge ox_k > oy_k$ then $\sigma := true$; 
    end 



current state of these vectors. 

Theorem 13 SetPointersAndFlags either sets $\alpha$, $\beta$, $\gamma$, and $\sigma$ as per their definitions, or 
establishes failure as $X \leq_m Y$ is disentailed. 

Proof: Line 2 of SetPointersAndFlags traverses $ox$ and $oy$, starting at index $u$, until either 
it reaches the end of the vectors (because $ox = oy$), or it finds an index $i$ where $ox_i \neq oy_i$. 
In the first case, $\alpha$ is set to $-\infty$ (line 4) as per Definition 5. In the second case, $\alpha$ is set to 
$i$ only if $ox_i < oy_i$ (line 5). This is correct by Definition 5 and means that $ox <_{lex} oy$. If, 
however, $ox_i > oy_i$ then we have $ox >_{lex} oy$. By Proposition 1, $X \leq_m Y$ is disentailed and thus 
SetPointersAndFlags terminates with failure (line 3). This also triggers the filtering algorithm 
to fail. 

If $\alpha \leq l$ then $\beta$ is set to $-\infty$ (line 6) as per Definition 6. Otherwise, the vectors are traversed 
in lines 9-11, starting at index $\alpha - 1$, until either the end of the vectors is reached (because 
$ox_{\alpha-1 \rightarrow l} \leq_{lex} oy_{\alpha-1 \rightarrow l}$), or an index $j$ where $ox_j > oy_j$ is found. In the first case, $\beta$ is set 
to $-\infty$ (line 12), and in the second case, $\beta$ is set to $j$ (line 13) as per Definition 6. During this 
traversal, the Boolean flag $temp$ is set to true iff 
$ox_{\alpha-1 \rightarrow max\{l, \beta+1\}} = oy_{\alpha-1 \rightarrow max\{l, \beta+1\}}$. In 
lines 14 and 15, $\gamma$ is set to true iff $\beta \neq -\infty$, and either $\beta = \alpha - 1$ or $temp$ is true (because 
$ox_{\alpha-1 \rightarrow \beta+1} = oy_{\alpha-1 \rightarrow \beta+1}$). This is correct by Definition 7. 

In line 14, $\sigma$ is initialised to false. If $\beta \leq l$ then $\sigma$ remains false (line 16) as per Definition 
8. Otherwise, the vectors are traversed in line 18, starting at index $\beta - 1$, until either the end of 
the vectors is reached (because $ox_{\beta-1 \rightarrow l} = oy_{\beta-1 \rightarrow l}$), or an index $k$ where $ox_k \neq oy_k$ is found. 
In the first case, $\sigma$ remains false as per Definition 8. In the second case, $\sigma$ is set to true only if 
$ox_k > oy_k$ (line 19). This is correct by Definition 8 and means that $ox_{\beta-1 \rightarrow l} >_{lex} oy_{\beta-1 \rightarrow l}$. If, 
however, $ox_k < oy_k$ then $\sigma$ remains false as per Definition 8. QED. 
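
The procedure translates almost line by line into executable code. The sketch below makes a few representation choices that are assumptions of the illustration rather than part of the paper: occurrence vectors are dictionaries mapping each value in $u..l$ to its count, $-\infty$ is `float("-inf")`, and failure is signalled by a hypothetical `Disentailed` exception.

```python
NEG_INF = float("-inf")


class Disentailed(Exception):
    """Raised when ox >lex oy, i.e. the constraint cannot be satisfied."""


def set_pointers_and_flags(ox, oy, u, l):
    """Return (alpha, beta, gamma, sigma) for occurrence maps ox and oy."""
    i = u
    while i >= l and ox[i] == oy[i]:          # line 2: skip the equal prefix
        i -= 1
    if i >= l and ox[i] > oy[i]:              # line 3
        raise Disentailed
    alpha = NEG_INF if i == l - 1 else i      # lines 4-5

    beta, temp = NEG_INF, True
    if alpha > l:                             # lines 6-13
        j = alpha - 1
        while j >= l and ox[j] <= oy[j]:
            if ox[j] < oy[j]:
                temp = False                  # subvectors between alpha and beta differ
            j -= 1
        beta = NEG_INF if j == l - 1 else j

    gamma = beta != NEG_INF and temp          # lines 14-15
    sigma = False
    if beta > l:                              # lines 16-19
        k = beta - 1
        while k >= l and ox[k] == oy[k]:
            k -= 1
        sigma = k >= l and ox[k] > oy[k]
    return alpha, beta, gamma, sigma


# Occurrence vectors of the worked example, keyed by value 5..0.
ox = dict(zip(range(5, -1, -1), [1, 1, 1, 1, 2, 0]))
oy = dict(zip(range(5, -1, -1), [1, 2, 1, 0, 1, 1]))
print(set_pointers_and_flags(ox, oy, 5, 0))
```

Each of the three loops moves strictly downwards and starts where the previous one stopped, so the vectors are scanned at most once in total.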

We now analyse the rest of MsetLeq, where the tight upper bound for $X_i$ and the tight lower 
bound for $Y_i$, for all $0 \leq i < n$, are sought. 

Theorem 14 MsetLeq either establishes failure if $X \leq_m Y$ is disentailed, or prunes all 
inconsistent values from $X$ and $Y$ to ensure $GAC(X \leq_m Y)$. 

Proof: 
If $X \leq_m Y$ is not disentailed then we have $ox \leq_{lex} oy$ by Proposition 1. This means that 
$min(X_i)$ and $max(Y_i)$ for all $0 \leq i < n$ are consistent by Proposition 2. The algorithm therefore 
seeks the tight upper bound for $X_i$ only if $max(X_i) > min(X_i)$ (lines B2-11), and similarly the 
tight lower bound for $Y_i$ only if $min(Y_i) < max(Y_i)$ (lines C2-9). 

For each $\mathcal{D}(X_i)$: (1) If $min(X_i) \geq \alpha$ then all values greater than $min(X_i)$ are pruned, 
giving $min(X_i)$ as the tight upper bound (line B3). This is correct by Theorem 7. (2) If 
$max(X_i) \geq \alpha \wedge min(X_i) < \alpha$ then: 

• all values greater than $\alpha$ are pruned (line B5); 

• $\alpha$ is pruned if $ox_\alpha + 1 = oy_\alpha \wedge min(X_i) = \beta \wedge \gamma \wedge ox_\beta > oy_\beta + 1$ (line B9), or 
$ox_\alpha + 1 = oy_\alpha \wedge min(X_i) = \beta \wedge \gamma \wedge ox_\beta = oy_\beta + 1 \wedge \sigma$ (line B8), or 
$ox_\alpha + 1 = oy_\alpha \wedge min(X_i) < \beta \wedge \gamma$ (line B11). 

All the values less than $\alpha$ remain in the domain. By Theorem 5, all the inconsistent values are 
removed. (3) If, however, $max(X_i) < \alpha$ then $max(X_i)$ is the tight upper bound by Theorem 6, 
and thus no pruning is necessary. 

For each $\mathcal{D}(Y_i)$: (1) If $max(Y_i) > \alpha$ then all values less than $max(Y_i)$ are pruned, giving 
$max(Y_i)$ as the tight lower bound (line C3). This is correct by Theorem 11. (2) If 
$max(Y_i) = \alpha \wedge min(Y_i) \leq \beta$ then: 

• all values less than $\beta$ are pruned if $ox_\alpha + 1 = oy_\alpha \wedge \gamma$ (line C6); 

• $\beta$ is pruned if $ox_\alpha + 1 = oy_\alpha \wedge \gamma \wedge ox_\beta > oy_\beta + 1$ (line C9) or 
$ox_\alpha + 1 = oy_\alpha \wedge \gamma \wedge ox_\beta = oy_\beta + 1 \wedge \sigma$ (line C8). 

All the values greater than $\beta$ remain in the domain. By Theorem 8, all the inconsistent values 
are removed. (3) If, however, $max(Y_i) = \alpha \wedge min(Y_i) > \beta$ or $max(Y_i) < \alpha$ then $min(Y_i)$ is the 
tight lower bound by Theorems 9 and 10, and thus no pruning is needed. 

MsetLeq is a correct and complete filtering algorithm, as it either establishes failure if 
$X \leq_m Y$ is disentailed, or prunes all inconsistent values from $X$ and $Y$ to ensure $GAC(X \leq_m Y)$. 
QED. 
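
The two pruning loops can be sketched as follows and replayed on the worked example of Section 4.1. The representation is an assumption of this illustration: domains are Python sets, the occurrence vectors are dictionaries keyed by value, and the pointers and flags are passed in as arguments (here the values $\alpha = 4$, $\beta = 2$, $\gamma = \sigma = true$ derived in the example) instead of being recomputed. The nested tests of lines B6-B9 and C7-C9 are folded into one condition, using the fact that $ox_\beta > oy_\beta$ always holds at $\beta$.

```python
def mset_leq_filter(X, Y, ox, oy, alpha, beta, gamma, sigma):
    """One pass of loops B and C of MsetLeq; returns the pruned domains."""
    def set_max(dom, v):
        return {w for w in dom if w <= v}

    def set_min(dom, v):
        return {w for w in dom if w >= v}

    new_x = []
    for dom in X:
        lo, hi = min(dom), max(dom)
        if lo != hi:                                   # B2
            if lo >= alpha:                            # B3
                dom = set_max(dom, lo)
            elif hi >= alpha:                          # B4 (lo < alpha here)
                dom = set_max(dom, alpha)              # B5
                if ox[alpha] + 1 == oy[alpha] and gamma:
                    if lo == beta and (ox[beta] > oy[beta] + 1 or sigma):
                        dom = set_max(dom, alpha - 1)  # B8 / B9
                    elif lo < beta:
                        dom = set_max(dom, alpha - 1)  # B11
        new_x.append(dom)

    new_y = []
    for dom in Y:
        lo, hi = min(dom), max(dom)
        if lo != hi:                                   # C2
            if hi > alpha:                             # C3
                dom = set_min(dom, hi)
            elif hi == alpha and lo <= beta:           # C4
                if ox[alpha] + 1 == oy[alpha] and gamma:
                    dom = set_min(dom, beta)           # C6
                    if ox[beta] > oy[beta] + 1 or sigma:
                        dom = set_min(dom, beta + 1)   # C8 / C9
        new_y.append(dom)
    return new_x, new_y


X = [{5}, {4, 5}, {3, 4, 5}, {2, 4}, {1}, {1}]
Y = [{4, 5}, {4}, {1, 2, 3, 4}, {2, 3}, {1}, {0}]
ox = dict(zip(range(5, -1, -1), [1, 1, 1, 1, 2, 0]))
oy = dict(zip(range(5, -1, -1), [1, 2, 1, 0, 1, 1]))
new_x, new_y = mset_leq_filter(X, Y, ox, oy, alpha=4, beta=2, gamma=True, sigma=True)
print(new_x)
print(new_y)
```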

When we prune a value, we do not need to check recursively that previous support remains. 
The algorithm tightens $max(X_i)$ and $min(Y_i)$ without touching $min(X_i)$ and $max(Y_i)$, for all 
$0 \leq i < n$, which provide support for the values in the vectors. The exception is if a domain 
wipe-out occurs. As the constraint is not disentailed, we have $ox \leq_{lex} oy$. This means $min(X_i)$ 
and $max(Y_i)$ for all $0 \leq i < n$ are supported. Hence, the prunings of the algorithm cannot cause 
any domain wipe-out. 

The algorithm works also when the vectors are of different length as we build and reason 
about the occurrence vectors as opposed to the original vectors. Also, we do not assume that 
the original vectors are of the same length when we set the pointer $\beta$. 

The algorithm corrects a mistake that appears in [FHK+03]. We have noticed that in 
[FHK+03] we do not always prune the values greater than $\alpha$ when we have $max(X_i) \geq \alpha$ and 
$min(X_i) < \alpha$. As shown above, this algorithm is correct and complete. 

To improve the time complexity, we assume that domains are transformed so that their 
union is a continuous interval. Suppose, for instance, that we have variables with domains 
{1,5}, {1,100} and {5,100}. This transformation normalises the domains to {1,2}, {1,3} and 
{2,3}. This technique is widely used (see for instance [KT05]) and does not change the 
worst-case complexity of our propagator. It gives us a tighter upper bound on the complexity of 
our propagator in terms of the number of distinct values as compared to the difference between 
the largest and smallest values. 
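
The normalisation replaces each value by its rank among the distinct values that occur in any domain. A minimal sketch, assuming set-valued domains; the function name `normalise` is chosen for the illustration.

```python
def normalise(domains):
    """Map the union of the domains onto the interval 1..d, preserving order."""
    values = sorted(set().union(*domains))
    rank = {v: r for r, v in enumerate(values, start=1)}
    return [{rank[v] for v in dom} for dom in domains]


print(normalise([{1, 5}, {1, 100}, {5, 100}]))
```

Because the mapping is order preserving, it does not change which pairs of ground vectors are multiset ordered.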

Theorem 15 Initialise runs in time $O(n + d)$, where d is the number of distinct values.

Proof: Initialise first constructs $\vec{ox}$ and $\vec{oy}$ of length d where each entry is zero, and then
increments $ox_{min(X_i)}$ and $oy_{max(Y_i)}$ by 1 for all $0 \leq i < n$. Hence, the complexity of initialisation
is $O(n + d)$. QED.
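
As a rough illustration of the $O(n + d)$ bound, the following Python sketch builds the two occurrence vectors from the lower bounds of X and the upper bounds of Y. It assumes values have already been normalised to 1..d and stores the largest value at index 0; the function name and the sample bounds are assumptions made for this example.

```python
def initialise(floor_x, ceiling_y, d):
    """Build occurrence vectors over values d..1 (largest value first)."""
    ox = [0] * d                 # O(d): every entry starts at zero
    oy = [0] * d
    for v in floor_x:            # O(n): v plays the role of min(X_i)
        ox[d - v] += 1
    for v in ceiling_y:          # O(n): v plays the role of max(Y_i)
        oy[d - v] += 1
    return ox, oy


ox, oy = initialise([5, 4, 3, 2, 2, 2, 2, 1], [5, 4, 4, 4, 3, 1, 1, 1], d=5)
print(ox, oy, ox <= oy)          # Python compares lists lexicographically
```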

Theorem 16 MsetLeq runs in time $O(nb + d)$, where b is the cost of adjusting the bounds of a
variable, and d is the number of distinct values.

Proof: MsetLeq does not construct $\vec{ox}$ and $\vec{oy}$, but rather uses their most up-to-date states.
MsetLeq first sets the pointers and flags which are defined on $\vec{ox}$ and $\vec{oy}$. In the worst case both
vectors are traversed once from the beginning until the end, which gives an $O(d)$ complexity.
Next, the algorithm goes through every variable in the original vectors X and Y to check for
support. Deciding the tight bound for each variable is a constant time operation, but the cost
of adjusting the bound is b. Since we have $O(n)$ variables, the complexity of the algorithm is
$O(nb + d)$. QED.

If $d \ll n$ then the algorithm runs in time $O(nb)$. Since a multiset is a set with possible
repetitions, we expect that the number of distinct values in a multiset is often less than the 
cardinality of the multiset, giving us a linear time filtering algorithm. 

5 Multiset Ordering with Large Domains 

MsetLeq is a linear time algorithm in n given that $d \ll n$. If instead we have $n \ll d$
then the complexity of the algorithm is $O(d)$, dominated by the cost of the construction of
the occurrence vectors and the initialisation of the pointers and flags. This can happen, for
instance, when the vectors being multiset ordered are variables in the occurrence representation
of a multiset [KW02]. Is there then an alternative way of propagating the multiset ordering
constraint whose complexity is independent of the domains?

5.1 Remedy 

When d is large, it can be costly to construct the occurrence vectors. We can instead
sort floor(X) and ceiling(Y), and compute $\alpha$, $\beta$, $\gamma$, $\sigma$, and the number of occurrences of $\alpha$
and $\beta$ in {{floor(X)}} and {{ceiling(Y)}} as if we had the occurrence vectors by scanning these
sorted vectors. This information is all we need to find support for the bounds of the variables.
Let us illustrate this on an example. To simplify presentation, we assume that the vectors are
of the same length. Consider $X \leq_m Y$ where $\vec{sx}$ = sort(floor(X)) and $\vec{sy}$ = sort(ceiling(Y))
are as follows:

sx = (5, 4, 3, 2, 2, 2, 2, 1) 
sy = (5, 4, 4, 4, 3, 1, 1, 1) 

We traverse $\vec{sx}$ and $\vec{sy}$ until we find an index i such that $sx_i < sy_i$, and for all $0 \leq t < i$ we have
$sx_t = sy_t$. In our example, i is 2:

            ↓ i
sx = (5, 4, 3, 2, 2, 2, 2, 1)
sy = (5, 4, 4, 4, 3, 1, 1, 1)

This means that the number of occurrences of any value greater than $sy_i$ is equal in {{floor(X)}}
and in {{ceiling(Y)}}, but there are more occurrences of $sy_i$ in {{ceiling(Y)}} than in {{floor(X)}}.
That is, $ox_5 = oy_5$ and $ox_4 < oy_4$. By Definition 5, $\alpha$ is equal to 4. We now move only along
$\vec{sy}$ until we find an index j such that $sy_j \neq sy_{j-1}$, so that we reason about the number of
occurrences of the smaller values. In our example, j is 4:

            ↓ i
sx = (5, 4, 3, 2, 2, 2, 2, 1)
sy = (5, 4, 4, 4, 3, 1, 1, 1)
                  ↑ j

Algorithm 4: Initialise

Data : $\langle X_0, X_1, \ldots, X_{n-1} \rangle$, $\langle Y_0, Y_1, \ldots, Y_{n-1} \rangle$
Result : sort(floor(X)) and sort(ceiling(Y)) are initialised, GAC($X \leq_m Y$)
1 $\vec{sx}$ := sort(floor(X));
2 $\vec{sy}$ := sort(ceiling(Y));
3 MsetLeq;



We here initialise $\gamma$ to true, and start traversing $\vec{sx}$ and $\vec{sy}$ simultaneously. We have $sx_i = sy_j = 3$.
This adds 1 to $ox_3$ and $oy_3$, keeping $\gamma$ = true. We move one index ahead in both vectors by
incrementing i to 3 and j to 5:

               ↓ i
sx = (5, 4, 3, 2, 2, 2, 2, 1)
sy = (5, 4, 4, 4, 3, 1, 1, 1)
                     ↑ j

We now have $sx_i > sy_j$, which suggests that $sx_i$ occurs at least once in {{floor(X)}} but does
not occur in {{ceiling(Y)}}. That is, $ox_2 > 0$ and $oy_2 = 0$. By Definition 6, $\beta$ points to 2. This
does not change that $\gamma$ is true. We now move only along $\vec{sx}$ by incrementing i until we find
$sx_i \neq sx_{i-1}$, so that we reason about the number of occurrences of the smaller values:

                           ↓ i
sx = (5, 4, 3, 2, 2, 2, 2, 1)
sy = (5, 4, 4, 4, 3, 1, 1, 1)
                     ↑ j

With the new value of i, we have $sx_i = sy_j = 1$. This increases both $ox_1$ and $oy_1$ by one. Reaching
the end of only $\vec{sx}$ hints the following: either 1 occurs more than once in {{ceiling(Y)}}, or
it occurs once but there are values in {{ceiling(Y)}} less than 1 and they do not occur in
{{floor(X)}}. By Definition 8, $\sigma$ is false.

Finally, we need to know the number of occurrences of $\alpha$ and $\beta$ in {{floor(X)}} and
{{ceiling(Y)}}. Since we already know what $\alpha$ and $\beta$ are, another scan of $\vec{sx}$ and $\vec{sy}$ gives
us the needed information: for all $0 \leq i < n$, we increment $ox_\alpha$ (resp. $ox_\beta$) by 1 if $sx_i = \alpha$ (resp.
$sx_i = \beta$), and also $oy_\alpha$ (resp. $oy_\beta$) by 1 if $sy_i = \alpha$ (resp. $sy_i = \beta$).

5.2 An Alternative Filtering Algorithm 

As witnessed in the previous section, it suffices to sort floor(X) and ceiling(Y), and scan the
sorted vectors to compute $\alpha$, $\beta$, $\gamma$, $\sigma$, $ox_\alpha$, $oy_\alpha$, $ox_\beta$, and $oy_\beta$. We can then directly reuse lines
B1-11 and C1-9 of MsetLeq to obtain a new filtering algorithm. As a result, we need to change
only Initialise and SetPointersAndFlags.

In Algorithm 4 we show the new Initialise. Instead of constructing a pair of occurrence
vectors associated with floor(X) and ceiling(Y), we now sort floor(X) and ceiling(Y) and
then call MsetLeq.

Similar to the original algorithm, we recompute the pointers and flags every time we call the
filtering algorithm. Maintaining the sorted vectors incrementally is trivial. When the minimum
value in some $\mathcal{D}(X_i)$ changes, we update $\vec{sx}$ by inserting the new $min(X_i)$ into, and removing the
old $min(X_i)$ from, $\vec{sx}$. Similarly, when the maximum value in some $\mathcal{D}(Y_i)$ changes, we update
$\vec{sy}$ by inserting the new $max(Y_i)$ into, and removing the old $max(Y_i)$ from, $\vec{sy}$. Since these
vectors need to remain sorted after the update, such modifications require binary search. The
cost of incrementality thus increases from $O(1)$ to $O(\log n)$ compared to the original filtering
algorithm.
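
The incremental update of a sorted vector can be sketched with Python's standard `bisect` module. This is an illustration only: the helper name `replace_bound` is hypothetical, and the vector is kept in ascending order internally because `bisect` expects that. Note also that a Python list shifts elements on insertion and deletion, so only the search is logarithmic here; a balanced tree or similar structure would be needed to match the $O(\log n)$ bound.

```python
import bisect


def replace_bound(asc, old, new):
    """Replace one occurrence of `old` by `new` in the ascending list `asc`."""
    pos = bisect.bisect_left(asc, old)          # binary search for the old bound
    if pos == len(asc) or asc[pos] != old:
        raise ValueError(f"{old} is not in the vector")
    del asc[pos]
    bisect.insort(asc, new)                     # insert the new bound, keeping order


sx_asc = [1, 2, 2, 2, 2, 3, 4, 5]               # floor(X) in ascending order
replace_bound(sx_asc, old=2, new=4)             # some min(X_i) is raised from 2 to 4
sx = sx_asc[::-1]                               # descending view used by the algorithm
print(sx)
```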
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Procedure SetPointersAndFlags 



1  $i := 0$;
2  while $i < n \wedge sx_i = sy_i$ do $i := i + 1$;
3  if $i < n \wedge sx_i > sy_i$ then fail;
4  else if $i = n$ then $\alpha := -\infty$, $\beta := -\infty$, $\gamma := false$, $\sigma := false$, return;
5  else $\alpha := sy_i$;
6  $\gamma := true$;
7  $j := i + 1$;
8  while $j < n \wedge sy_j = sy_{j-1}$ do $j := j + 1$;
9  if $j = n$ then $\beta := sx_i$;
10 else if $j < n$ then
11     while $i < n \wedge j < n$ do
12         if $sx_i > sy_j$ then $\beta := sx_i$, break;
13         if $sx_i < sy_j$ then $\gamma := false$, $j := j + 1$;
14         if $sx_i = sy_j$ then $i := i + 1$, $j := j + 1$;
       end
15     if $j = n$ then $\beta := sx_i$;
   end
16 $k := i + 1$;
17 while $k < n \wedge sx_k = sx_{k-1}$ do $k := k + 1$;
18 if $k = n$ then $\sigma := false$;
19 else if $k < n$ then
20     while $k < n \wedge j < n$ do
21         if $sx_k > sy_j$ then $\sigma := true$, break;
22         if $sx_k < sy_j$ then $\sigma := false$, break;
23         if $sx_k = sy_j$ then $k := k + 1$, $j := j + 1$;
       end
24     if $k = n$ then $\sigma := false$;
25     else if $j = n$ then $\sigma := true$;
   end
26 $i := 0$, $ox_\alpha := 0$, $oy_\alpha := 0$, $ox_\beta := 0$, $oy_\beta := 0$;
27 foreach $i \in [0, n)$ do
28     if $sx_i = \alpha$ then $ox_\alpha := ox_\alpha + 1$;
29     if $sx_i = \beta$ then $ox_\beta := ox_\beta + 1$;
30     if $sy_i = \alpha$ then $oy_\alpha := oy_\alpha + 1$;
31     if $sy_i = \beta$ then $oy_\beta := oy_\beta + 1$;
   end



Given the most up-to-date $\vec{sx}$ and $\vec{sy}$, how do we set our pointers and flags? In line 2 of our
new SetPointersAndFlags, we traverse $\vec{sx}$ and $\vec{sy}$, starting at index 0, until either we reach the
end of the vectors (because the vectors are equal), or we find an index i where $sx_i \neq sy_i$. In the
first case, we first set $\alpha$ and $\beta$ to $-\infty$, and $\gamma$ and $\sigma$ to false, and then return (line 4). In the
second case, if $sx_i > sy_i$ then disentailment is detected and SetPointersAndFlags terminates
with failure (line 3). The return and the failure are justified by the following theoretical
result.

Theorem 17 $occ(x) \leq_{lex} occ(y)$ iff $sort(x) \leq_{lex} sort(y)$.

Proof: ($\Rightarrow$) If $occ(x) <_{lex} occ(y)$ then a value a occurs more in {{y}} than in {{x}}, and the
occurrence of any value b > a is the same in both multisets. By deleting all the occurrences
of a from {{x}} and the same number of occurrences of a from {{y}}, as well as any b > a from
both multisets, we get max{{x}} < max{{y}}. Since the leftmost values in sort(x) and sort(y)
are max{{x}} and max{{y}} respectively, we have $sort(x) <_{lex} sort(y)$. If occ(x) = occ(y) then
we have {{x}} = {{y}}. By sorting the elements in x and y, we obtain the same vectors. Hence,
sort(x) = sort(y).

($\Leftarrow$) Suppose $\vec{ox}$ = occ(x), $\vec{oy}$ = occ(y), $\vec{sx}$ = sort(x), $\vec{sy}$ = sort(y), and we have $\vec{sx} = \vec{sy}$.
Then {{x}} and {{y}} contain the same elements with equal occurrences. Hence, $\vec{ox} = \vec{oy}$. Suppose
$\vec{sx} <_{lex} \vec{sy}$. If $sx_0 < sy_0$ then the leftmost index of $\vec{ox}$ and $\vec{oy}$ is $sy_0$, and we have $ox_{sy_0} = 0$ and
$oy_{sy_0} > 0$. This gives $\vec{ox} <_{lex} \vec{oy}$. If $sx_0 = sy_0 = a$ then we eliminate one occurrence of a from
{{x}} and {{y}}, and compare the resulting multisets. QED.
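
The equivalence can also be checked exhaustively on small instances. The sketch below is an illustration rather than part of the proof: it assumes vectors of equal length over a small integer range, and the helper names are hypothetical. It relies on the fact that Python compares lists lexicographically.

```python
from itertools import product


def occ(x, lo, hi):
    """Occurrence vector of x over the values hi..lo (largest value first)."""
    return [x.count(v) for v in range(hi, lo - 1, -1)]


def sort_desc(x):
    """The vector x sorted in descending order."""
    return sorted(x, reverse=True)


def theorem_holds(lo=0, hi=3, n=3):
    """Compare both orderings on every pair of length-n vectors over lo..hi."""
    vectors = list(product(range(lo, hi + 1), repeat=n))
    for x in vectors:
        for y in vectors:
            by_occ = occ(x, lo, hi) <= occ(y, lo, hi)
            by_sort = sort_desc(x) <= sort_desc(y)
            if by_occ != by_sort:
                return False
    return True


print(theorem_holds())
```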

Hence, whenever we have $\vec{sx} >_{lex} \vec{sy}$, we proceed as if we had occ(floor(X)) $>_{lex}$ occ(ceiling(Y)).
But then what do we do if we have $\vec{sx} <_{lex} \vec{sy}$? In line 5, we have $sx_i < sy_i$ and $sx_t = sy_t$ for
all $0 \leq t < i$. This means that the number of occurrences of any value greater than $sy_i$ is equal
in {{floor(X)}} and in {{ceiling(Y)}}, but there are more occurrences of $sy_i$ in {{ceiling(Y)}}
than in {{floor(X)}}. Therefore, we here set $\alpha$ to $sy_i$.

After initialising $\gamma$ to true in line 6, we start seeking a value for $\beta$. For the sake of simplicity,
we here assume our original vectors are of same length. Hence, $\beta$ cannot be $-\infty$ as $\alpha$ is not
$-\infty$. In line 8, we traverse $\vec{sy}$, starting at index i + 1, until either we reach the end of the
vector (because all the remaining values in {{ceiling(Y)}} are $sy_i$), or we find an index j such
that $sy_j \neq sy_{j-1}$. In the first case, we set $\beta$ to $sx_i$ (line 9) because $sx_i$ occurs at least once
in {{floor(X)}} but does not occur in {{ceiling(Y)}}. Since no value between $\alpha$ and $\beta$ occurs
more in {{ceiling(Y)}} than in {{floor(X)}}, $\gamma$ remains true. In the second case, $sy_j$ gives us
the next largest value in {{ceiling(Y)}}. In lines 11-14, we traverse $\vec{sx}$ starting from i, and
$\vec{sy}$ starting from j. If $sx_i > sy_j$ then we set $\beta$ to $sx_i$ (line 12) because $sx_i$ occurs more in
{{floor(X)}} than in {{ceiling(Y)}}. Having found the value of $\beta$, we here exit the while loop
using break. If $sx_i < sy_j$ then $sy_j$ occurs more in {{ceiling(Y)}} than in {{floor(X)}}. Since
we are still looking for a value for $\beta$, we set $\gamma$ to false (line 13). We then move to the next index
in $\vec{sy}$ to find the next largest value in {{ceiling(Y)}}. If $sx_i = sy_j$ then we move to the next
index both in $\vec{sx}$ and $\vec{sy}$ to find the next largest values in {{floor(X)}} and {{ceiling(Y)}} (line
14). As j is at least one index ahead of i, j can reach n before i does during this traversal.
In such a case, we set $\beta$ to $sx_i$ (line 15) due to the same reasoning as in line 12.

The process of finding the value of $\sigma$ (lines 16-25) is very similar to that of $\beta$. In line 17, we
traverse $\vec{sx}$, starting at index i + 1, until either we reach the end of the vector (because all the
remaining values in {{floor(X)}} are $\beta$), or we find an index k such that $sx_k \neq sx_{k-1}$. In the
first case, we set $\sigma$ to false (line 18) because either $sy_j$ occurs at least once in {{ceiling(Y)}}
but does not occur in {{floor(X)}} (due to line 12), or there are no values less than $\beta$ both in
{{floor(X)}} and in {{ceiling(Y)}} (due to line 15). In the second case, $sx_k$ gives us the next
largest value in {{floor(X)}}. In lines 20-23, we traverse $\vec{sx}$ starting from k, and $\vec{sy}$ starting
from j. The reasoning now is very similar to that of the traversal for $\beta$. Instead of setting a
value for $\beta$, we set $\sigma$ to true, and instead of setting $\gamma$ to false, we set $\sigma$ to false, for the same
reasons. If k reaches n before j, then we set $\sigma$ to false (line 24) due to the same reason as in line
22. If k and j reach n together, then again we set $\sigma$ to false, because we have the same number
of occurrences of any value less than $\beta$ in {{floor(X)}} and in {{ceiling(Y)}}. If, however, j
reaches n before k, then we set $\sigma$ to true (line 25) due to the same reason as in line 21.

Finally, we go through each of $sx_i$ and $sy_i$ in lines 26-31, and find how many times $\alpha$ and $\beta$
occur in {{floor(X)}} and in {{ceiling(Y)}}, by counting how many times $\alpha$ and $\beta$ occur in $\vec{sx}$
and in $\vec{sy}$, respectively.
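
The procedure translates fairly directly into Python. The sketch below is a hedged transcription, not the authors' implementation: it assumes two equal-length vectors already sorted in descending order, signals disentailment with an exception, returns the results in a dictionary with illustrative key names, and uses `elif`/`else` where the pseudocode uses consecutive `if` statements so that no index is read after it has been advanced.

```python
import math


def set_pointers_and_flags(sx, sy):
    """Compute alpha, beta, gamma, sigma and the occurrence counts of
    alpha and beta from two descending, equal-length vectors."""
    n = len(sx)
    i = 0
    while i < n and sx[i] == sy[i]:                  # line 2
        i += 1
    if i < n and sx[i] > sy[i]:                      # line 3
        raise ValueError("constraint is disentailed")
    if i == n:                                       # line 4
        return {"alpha": -math.inf, "beta": -math.inf,
                "gamma": False, "sigma": False}
    alpha, gamma = sy[i], True                       # lines 5-6
    j = i + 1
    while j < n and sy[j] == sy[j - 1]:              # line 8
        j += 1
    beta = None
    while i < n and j < n:                           # lines 11-14
        if sx[i] > sy[j]:
            beta = sx[i]
            break
        if sx[i] < sy[j]:
            gamma = False
            j += 1
        else:
            i += 1
            j += 1
    if beta is None:                                 # lines 9 and 15: j reached n
        beta = sx[i]
    k = i + 1
    while k < n and sx[k] == sx[k - 1]:              # line 17
        k += 1
    while k < n and j < n:                           # lines 20-23
        if sx[k] != sy[j]:
            sigma = sx[k] > sy[j]
            break
        k += 1
        j += 1
    else:                                            # no break: lines 18, 24, 25
        sigma = k < n
    return {"alpha": alpha, "beta": beta, "gamma": gamma, "sigma": sigma,
            "ox_alpha": sx.count(alpha), "oy_alpha": sy.count(alpha),
            "ox_beta": sx.count(beta), "oy_beta": sy.count(beta)}


result = set_pointers_and_flags([5, 4, 3, 2, 2, 2, 2, 1],
                                [5, 4, 4, 4, 3, 1, 1, 1])
print(result)
```

The final counting step uses `list.count` instead of a single combined scan; both are linear in n.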

The complexity of this new algorithm is independent of the domains and is $O(n \log n)$, as
the cost of sorting dominates.

6 Extensions 

In this section, we answer two important questions. First, how can we enforce strict multiset 
ordering? Second, how can we detect entailment?

6.1 Strict Multiset Ordering Constraint

We can easily get a filtering algorithm for the strict multiset ordering constraint by slightly modifying
MsetLeq. This new algorithm, called MsetLess, either detects the disentailment of $X <_m Y$
or prunes inconsistent values to perform GAC on $X <_m Y$. Before showing how we modify
MsetLeq, we first study $X <_m Y$ from a theoretical point of view. It is not difficult to modify
Theorems [2j [3] and S] so as to exclude the equality and obtain the following propositions: 

Proposition 3 $X <_m Y$ is disentailed iff $occ(floor(X)) \geq_{lex} occ(ceiling(Y))$.

Proposition 4 GAC($X <_m Y$) iff for all i in [0, n):

$$occ(floor(X_{X_i \leftarrow max(X_i)})) <_{lex} occ(ceiling(Y))$$

$$occ(floor(X)) <_{lex} occ(ceiling(Y_{Y_i \leftarrow min(Y_i)}))$$

We can exploit the similarity between Propositions 2 and 4 and find the tight consistent bounds
by making use of the occurrence vectors $\vec{ox}$ = occ(floor(X)) and $\vec{oy}$ = occ(ceiling(Y)), the
pointers, and the flags. In Theorems 5 to 11 we have $\vec{ox} \leq_{lex} \vec{oy}$. We decide whether a value
v in some domain $\mathcal{D}$ is consistent or not by first increasing $ox_v$ / $oy_v$ by 1, and then decreasing
$ox_{min(\mathcal{D})}$ / $oy_{max(\mathcal{D})}$ by 1. The value is consistent for $X \leq_m Y$ iff the change gives $\vec{ox} \leq_{lex} \vec{oy}$. In
Theorems 7 and 11, changing the occurrences gives $\vec{ox} >_{lex} \vec{oy}$. This means that v is inconsistent
not only for $X \leq_m Y$ but also for $X <_m Y$. In Theorems 6, 9, and 10, however, we initially
have $\vec{ox} <_{lex} \vec{oy}$ and changing the occurrences does not disturb the strict lexicographic ordering.
This suggests v is consistent also for $X <_m Y$.

In Theorems 5 and 8 we initially have $\vec{ox} <_{lex} \vec{oy}$, and after the change we obtain either of
$\vec{ox} >_{lex} \vec{oy}$, $\vec{ox} = \vec{oy}$, and $\vec{ox} <_{lex} \vec{oy}$. In the first case v is inconsistent, whereas in the third case
v is consistent, for both constraints. In the second case, however, v is consistent for $X \leq_m Y$
but not for $X <_m Y$. This case arises if we get $\vec{ox}_{u \to \beta} = \vec{oy}_{u \to \beta}$ by the change to the occurrence
vectors, and we have either $\beta > l$ and $\vec{ox}_{\beta-1 \to l} = \vec{oy}_{\beta-1 \to l}$, or $\beta = l$. We therefore need to record
whether there are any subvectors below $\beta$, and if this is the case we need to know whether
they are equal. This can easily be done by extending the definition of $\sigma$ which already tells us
whether we have $\beta > l$ and $\vec{ox}_{\beta-1 \to l} >_{lex} \vec{oy}_{\beta-1 \to l}$.

Definition 9 Given $\vec{ox}$ = occ(floor(X)) and $\vec{oy}$ = occ(ceiling(Y)) indexed as u..l where
$\vec{ox} <_{lex} \vec{oy}$, the flag $\sigma$ is true iff:

$$(\beta > l \wedge \vec{ox}_{\beta-1 \to l} \geq_{lex} \vec{oy}_{\beta-1 \to l}) \vee \beta = l$$

Theorems 5 and 8 now declare a value inconsistent if we get $\vec{ox}_{u \to \beta} = \vec{oy}_{u \to \beta}$ when the occurrence
vectors change, and we have either $\beta > l$ and $\vec{ox}_{\beta-1 \to l} = \vec{oy}_{\beta-1 \to l}$, or $\beta = l$.

How do we now modify MsetLeq to obtain the filtering algorithm MsetLess? Theorems 6,
7, 9, 10, and 11 are valid also for $X <_m Y$. Moreover, Theorems 5 and 8 can easily be adapted
for $X <_m Y$ by changing the definition of $\sigma$. Hence, the pruning part of the algorithm need not
be modified, provided that $\sigma$ is set correctly. Also, by Proposition 3 we need to fail under
the new disentailment condition. These suggest we only need to revise SetPointersAndFlags,
so that we fail whenever we have $\vec{ox} \geq_{lex} \vec{oy}$, and set $\sigma$ to true also when we have $\beta = l$, or
$\beta > l$ and $\vec{ox}_{\beta-1 \to l} = \vec{oy}_{\beta-1 \to l}$. This corrects a mistake in [FHK+03] which claims that failing
whenever we have $\vec{ox} \geq_{lex} \vec{oy}$ and setting $\beta$ to $l - 1$ as opposed to $-\infty$ are enough to achieve
strict multiset ordering.

6.2 Entailment 

Algorithm 6: Initialise

Data : $\langle X_0, X_1, \ldots, X_{n-1} \rangle$, $\langle Y_0, Y_1, \ldots, Y_{n-1} \rangle$
Result : occ(floor(X)), occ(ceiling(Y)), occ(ceiling(X)), occ(floor(Y)), and entailed are initialised, GAC($X \leq_m Y$)
0 entailed := false;
5 $\vec{ex}$ := occ(ceiling(X));
6 $\vec{ey}$ := occ(floor(Y));
7 MsetLeq;

MsetLeq is a correct and complete filtering algorithm. However, it does not detect entailment.
Even though detecting entailment does not change the semantics of the algorithm, it can lead to
significant savings from an operational point of view. We thus introduce another Boolean flag,
called entailed, which indicates whether $X \leq_m Y$ is entailed. More formally:

Definition 10 Given X and Y, the flag entailed is set to true iff $X \leq_m Y$ is true.

The multiset ordering constraint is entailed whenever the largest value that X can take is
less than or equal to the smallest value that Y can take under the ordering in question.

Theorem 18 $X \leq_m Y$ is entailed iff {{ceiling(X)}} $\leq_m$ {{floor(Y)}}.

Proof: ($\Rightarrow$) Since $X \leq_m Y$ is entailed, any combination of assignments, including $X \leftarrow$
ceiling(X) and $Y \leftarrow$ floor(Y), satisfies $X \leq_m Y$. Hence, {{ceiling(X)}} $\leq_m$ {{floor(Y)}}.

($\Leftarrow$) Any $x \in X$ is less than or equal to any $y \in Y$ under multiset ordering. Hence, $X \leq_m Y$
is entailed. QED.

By Theorem 18 and the equivalence between multiset ordering and lexicographic ordering of
occurrence vectors, we can detect entailment by lexicographically comparing the occurrence
vectors associated with ceiling(X) and floor(Y).

Proposition 5 $X \leq_m Y$ is entailed iff $occ(ceiling(X)) \leq_{lex} occ(floor(Y))$.
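
A small Python sketch of this entailment test follows. It is an illustration under stated assumptions: domains are given as sets of integers, X and Y have the same length, and the function names and sample domains are chosen for this example only.

```python
def occ(values, lo, hi):
    """Occurrence vector of `values` over hi..lo (largest value first)."""
    return [values.count(v) for v in range(hi, lo - 1, -1)]


def is_entailed(dom_x, dom_y):
    """Test occ(ceiling(X)) <=lex occ(floor(Y)) on the current domains."""
    all_values = [v for d in dom_x + dom_y for v in d]
    lo, hi = min(all_values), max(all_values)
    ex = occ([max(d) for d in dom_x], lo, hi)   # occ(ceiling(X))
    ey = occ([min(d) for d in dom_y], lo, hi)   # occ(floor(Y))
    return ex <= ey                             # lexicographic list comparison


before = is_entailed([{1, 2}, {1, 2, 4}], [{2, 3}, {2, 3}])
after = is_entailed([{1, 2}, {1, 2}], [{2, 3}, {2, 3}])
print(before, after)
```

In the first call the value 4 in the second domain of X keeps the largest multiset of X above the smallest multiset of Y; once it is removed, the two occurrence vectors coincide and the test succeeds.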

When MsetLeq is executed, we have three possible scenarios in terms of entailment: (1)
$X \leq_m Y$ has already been entailed in the past due to the previous modifications to the variables;
(2) $X \leq_m Y$ was not entailed before, but after the recent modifications which invoked the
algorithm, $X \leq_m Y$ is now entailed; (3) $X \leq_m Y$ has not been entailed, but after the prunings
of the algorithm, $X \leq_m Y$ is now entailed. In all cases, we can safely return from the algorithm.
We need to, however, record entailment in our flag entailed in the second and the third cases,
before returning.

To deal with entailment, we need to modify both Initialise and MsetLeq. In Algorithm 6
we show how we revise Algorithm 1. We add line 0 to initialise the flag entailed to false.
We replace line 5 of Algorithm 1 with lines 5-7. Before calling MsetLeq, we now initialise
our new occurrence vectors occ(ceiling(X)) and occ(floor(Y)) in a similar way to that of
occ(floor(X)) and occ(ceiling(Y)): we create a pair of vectors $\vec{ex}$ and $\vec{ey}$ of length $u - l + 1$
where each $ex_t$ and $ey_t$ are first set to 0. Then, for each value v in {{ceiling(X)}}, we increment
$ex_v$ by 1. Similarly, for each v in {{floor(Y)}}, we increment $ey_v$ by 1. These vectors are then
used in MsetLeq to detect entailment. It is possible to maintain $\vec{ex}$ and $\vec{ey}$ incrementally.
When the maximum value in some $\mathcal{D}(X_i)$ changes, we update $\vec{ex}$ by incrementing the entry
corresponding to new $max(X_i)$ by 1, and decrementing the entry corresponding to old $max(X_i)$
by 1. Likewise, when the minimum value in some $\mathcal{D}(Y_i)$ changes, we update $\vec{ey}$ by incrementing
the entry corresponding to new $min(Y_i)$ by 1, and decrementing the entry corresponding to old
$min(Y_i)$ by 1.
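
The incremental update amounts to moving one occurrence between two entries. A minimal Python sketch, assuming the vectors are indexed from the largest value u down to l with u stored at index 0 (the helper name `move_occurrence` is hypothetical):

```python
def move_occurrence(vec, u, old, new):
    """Move one occurrence from value `old` to value `new` in constant time."""
    vec[u - old] -= 1        # the old bound loses an occurrence
    vec[u - new] += 1        # the new bound gains one


u = 4
ex = [1, 0, 1, 0]            # occ(ceiling(X)) for ceiling(X) = (2, 4), values 4..1
move_occurrence(ex, u, old=4, new=2)   # some max(X_i) drops from 4 to 2
print(ex)
```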

In Algorithm 7 we show how we modify the filtering algorithm given in Algorithm 2 to
deal with the three possible scenarios described above. We add line A0 where we return if
the constraint has already been entailed in the past. Moreover, just before setting our pointers
and flags, we check whether the recent modifications that triggered the algorithm resulted in
entailment. If this is the case, we first set entailed to true and then return from the algorithm.
Furthermore, we check entailment after the algorithm goes through its variables. Lines B1-B11
visit the variables of X and prune inconsistent values from the upper bounds, affecting $\vec{ex}$.
Even if we have $\vec{ex} >_{lex} \vec{ey}$ when the algorithm is called, we might get $\vec{ex} \leq_{lex} \vec{ey}$ just before
the algorithm proceeds to the variables of Y. In such a case, we return from the algorithm after
setting entailed to true. As an example, assume we have $X \leq_m Y$, and MsetLeq is called with
X = ({1,2}, {1,2,4}) and Y = ({2,3}, {2,3}). As 4 in $\mathcal{D}(X_1)$ lacks support, it is pruned. Now
we have $\vec{ex} = \vec{ey}$. Alternatively, the constraint might be entailed after the algorithm visits the
variables of Y and prunes inconsistent values from the lower bounds, affecting $\vec{ey}$. In this case,
we return from the algorithm by setting entailed to true. As an example, assume that some
$\mathcal{D}(Y_i)$ in the previous example also contains a value that lacks support. The constraint is
entailed only after the variables of Y are visited and that value is removed.

Algorithm 7: MsetLeq

Data : $\langle X_0, X_1, \ldots, X_{n-1} \rangle$, $\langle Y_0, Y_1, \ldots, Y_{n-1} \rangle$
Result : GAC($X \leq_m Y$)
A0 if entailed then return;
=> if $\vec{ex} \leq_{lex} \vec{ey}$ then entailed := true, return;
A1 SetPointersAndFlags;
B1 foreach $i \in [0, n)$ do
B2     if $min(X_i) \neq max(X_i)$ then
B3         if $min(X_i) > \alpha$ then
               $ex_{max(X_i)} := ex_{max(X_i)} - 1$, setMax($X_i$, $min(X_i)$), $ex_{max(X_i)} := ex_{max(X_i)} + 1$;
           end
B4         if $max(X_i) > \alpha \wedge min(X_i) < \alpha$ then
B5             $ex_{max(X_i)} := ex_{max(X_i)} - 1$, setMax($X_i$, $\alpha$), $ex_{max(X_i)} := ex_{max(X_i)} + 1$;
               ...
           end
       end
   end
=> if $\vec{ex} \leq_{lex} \vec{ey}$ then entailed := true, return;
C1 foreach $i \in [0, n)$ do
C2     if $min(Y_i) \neq max(Y_i)$ then
C3         if $max(Y_i) > \alpha$ then
               $ey_{min(Y_i)} := ey_{min(Y_i)} - 1$, setMin($Y_i$, $max(Y_i)$), $ey_{min(Y_i)} := ey_{min(Y_i)} + 1$;
           end
C4         if $max(Y_i) = \alpha \wedge min(Y_i) < \beta$ then
C5             if $ox_\alpha + 1 = oy_\alpha \wedge \gamma$ then
C6                 $ey_{min(Y_i)} := ey_{min(Y_i)} - 1$, setMin($Y_i$, $\beta$), $ey_{min(Y_i)} := ey_{min(Y_i)} + 1$;
                   ...
               end
           end
       end
   end
=> if $\vec{ex} \leq_{lex} \vec{ey}$ then entailed := true, return;

Finally, before/after the algorithm modifies $max(X_i)$ or $min(Y_i)$ of some i in [0, n), we
keep our occurrence vectors $\vec{ex}$ and $\vec{ey}$ up-to-date by decrementing/incrementing the necessary
entries.

7 Alternative Approaches



There are several alternative ways known for posting and propagating multiset ordering con- 
straints. We can, for instance, post arithmetic inequality constraints, or decompose multiset 
ordering constraints into other constraints. In this section, we explore these approaches and ar- 
gue why it is preferable to propagate multiset ordering constraints using our filtering algorithms. 

7.1 Arithmetic Constraint 

We can achieve multiset ordering between two vectors by assigning a weight to each value,
summing the weights along each vector, and then requiring that the sum for X does not exceed
the sum for Y. Since the ordering is determined according to the maximum value in the vectors,
the weight should increase with the value. A suitable weighting scheme was proposed in [KS02], where each value
v gets assigned the weight $n^v$, where n is the length of the vectors. $X \leq_m Y$ on vectors of length
n can then be enforced via the following arithmetic inequality constraint:

$$n^{X_0} + \ldots + n^{X_{n-1}} \leq n^{Y_0} + \ldots + n^{Y_{n-1}}$$

Therefore, a vector containing one element with value v and n - 1 0s is greater than a vector
whose n elements are only v - 1. This is in fact similar to the transformation of a leximin fuzzy
CSP into an equivalent MAX CSP [SFV95]. The strict multiset ordering constraint $X <_m Y$ is
enforced by disallowing equality:

$$n^{X_0} + \ldots + n^{X_{n-1}} < n^{Y_0} + \ldots + n^{Y_{n-1}}$$

BC on such arithmetic constraints does the same pruning as GAC on the original multiset
ordering constraints. However, such arithmetic constraints are feasible only for small n and u,
where u is the maximum value in the domains of the variables. As n and u get large, $n^{X_i}$ or
$n^{Y_i}$ will be a very large number and therefore it might be impossible to implement the multiset
ordering constraint. Consequently, it can be preferable to post and propagate the multiset
ordering constraints using our global constraints.
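
The weighting scheme can be checked by brute force on small ground vectors. The Python sketch below is illustrative only: it assumes non-negative integer values and vectors of length at least 2, tests the multiset ordering by comparing the vectors sorted in descending order, and uses hypothetical function names.

```python
from itertools import product


def mset_leq(x, y):
    """Multiset ordering on ground vectors via descending sort."""
    return sorted(x, reverse=True) <= sorted(y, reverse=True)


def weight_leq(x, y):
    """Arithmetic formulation: each value v weighs n**v, n the vector length."""
    n = len(x)
    return sum(n ** v for v in x) <= sum(n ** v for v in y)


def agree(n=3, u=3):
    """Check that both tests coincide on all length-n vectors over 0..u."""
    vectors = list(product(range(u + 1), repeat=n))
    return all(mset_leq(x, y) == weight_leq(x, y)
               for x in vectors for y in vectors)


print(agree())
# A single weight for n = 50 and value 100 is far beyond 64-bit integers.
print(50 ** 100 > 2 ** 63)
```

Python integers have arbitrary precision, so the sketch itself does not overflow; a solver restricted to fixed-width integers would.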

Theorem 19 GAC($X \leq_m Y$) and GAC($X <_m Y$) are equivalent to BC on the corresponding
arithmetic constraints.

Proof: We just consider GAC($X \leq_m Y$) as the proof for GAC($X <_m Y$) is entirely analogous.
As $X \leq_m Y$ and the corresponding arithmetic constraint are logically equivalent, BC($X \leq_m Y$)
and BC on the arithmetic constraint are equivalent. As established earlier, BC($X \leq_m Y$) is equivalent
to GAC($X \leq_m Y$). QED.

7.2 Decomposition 

Global ordering constraints can often be built out of the logical connectives (such as $\wedge$ and $\vee$)
and existing (global) constraints. We can thus compose other constraints between X and Y
so as to obtain the multiset ordering constraint between X and Y. We refer to such a logical
constraint as a decomposition of the multiset ordering constraint.

Two vectors of integers x and y are multiset ordered, {{x}} $\leq_m$ {{y}}, iff
$occ(x) \leq_{lex} occ(y)$, as established earlier. One way of decomposing the multiset ordering constraint
$X \leq_m Y$ is thus insisting that the occurrence vectors associated with the vectors assigned to X
and Y are lexicographically ordered. Such occurrence vectors can be constructed via an extended
global cardinality constraint (gcc). Given a vector of variables X and a vector of values d, the
constraint gcc(X, d, OX) ensures that $OX_i$ is the number of variables in X assigned to $d_i$. To
ensure multiset ordering, we can enforce a lexicographic ordering constraint on a pair of occurrence
vectors constructed via gcc where d is the vector of values that the variables can be assigned to,
arranged in descending order, without any repetition:

$$gcc(X, d, OX) \wedge gcc(Y, d, OY) \wedge OX \leq_{lex} OY$$

In order to decompose the strict multiset ordering constraint $X <_m Y$, we need to enforce a strict
lexicographic ordering constraint on the occurrence vectors:

$$gcc(X, d, OX) \wedge gcc(Y, d, OY) \wedge OX <_{lex} OY$$

We call this way of decomposing a multiset ordering constraint the gcc decomposition.

The gcc constraint is available in, for instance, ILOG Solver 5.3 [ILO02], SICStus Prolog
3.10.1 [SIC04], and the FaCiLe constraint solver 1.0 [FaC01]. These solvers propagate the gcc
constraint using the algorithm proposed in [Reg96]. Among the various filtering algorithms of
gcc, which maintain either GAC [Reg96] [QvBL+03] or BC [QvBL+03] [KT05], only the
algorithms in [KT05] prune values from OX and OY. Even though the algorithm integrated in
ILOG Solver 5.3 may also prune the occurrence vectors, this may not always be the case. For
instance, when we have gcc(({1}, {1,2}, {1,2}, {2}, {3,4}, {3,4}), (4, 3, 2, 1), ({1}, {1}, {1,2},
{1,2,3})), ILOG Solver 5.3 leaves OX unchanged even though 1 in $\mathcal{D}(OX_3)$ is not consistent.
This shows that there is currently very limited support in the constraint toolkits to propagate
the multiset ordering constraint using the gcc decomposition. Also, as the following theorems
demonstrate, the gcc decomposition of a multiset ordering constraint hinders constraint
propagation.

Theorem 20 GAC($X \leq_m Y$) is strictly stronger than GAC(gcc(X, d, OX)), GAC(gcc(Y,
d, OY)), and GAC($OX \leq_{lex} OY$), where d is the vector of values that the variables can take,
arranged in descending order, without any repetition.

Proof: Since $X \leq_m Y$ is GAC, every value has a support x and y where $occ(x) \leq_{lex} occ(y)$,
in which case all the three constraints posted in the decomposition are satisfied. Hence, every
constraint imposed is GAC, and GAC($X \leq_m Y$) is as strong as its decomposition. To show
strictness, consider X = ({0,3}, {2}) and Y = ({2,3}, {1}). The multiset ordering constraint
$X \leq_m Y$ is not GAC as 3 in $\mathcal{D}(X_0)$ has no support. By enforcing GAC(gcc(X, (3, 2, 1, 0), OX))
and GAC(gcc(Y, (3, 2, 1, 0), OY)) we obtain the following occurrence vectors:

OX = ({0,1}, {1}, {0}, {0,1})
OY = ({0,1}, {0,1}, {1}, {0})

Since we have GAC($OX \leq_{lex} OY$), X and Y remain unchanged. QED.
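
The claim that 3 in the first domain of X has no support can be confirmed by enumerating all assignments. The following Python sketch is a brute-force check for small instances only, not a filtering algorithm; the function names are hypothetical, and the multiset ordering is tested by comparing the vectors sorted in descending order.

```python
from itertools import product


def mset_leq(x, y):
    """Multiset ordering on ground vectors via descending sort."""
    return sorted(x, reverse=True) <= sorted(y, reverse=True)


def unsupported_values(dom_x, dom_y):
    """For each variable of X then Y, return the values that appear in no
    assignment satisfying the multiset ordering constraint."""
    nx = len(dom_x)
    doms = [sorted(d) for d in dom_x + dom_y]
    supported = [set() for _ in doms]
    for assignment in product(*doms):
        if mset_leq(assignment[:nx], assignment[nx:]):
            for idx, v in enumerate(assignment):
                supported[idx].add(v)
    return [set(d) - s for d, s in zip(doms, supported)]


print(unsupported_values([{0, 3}, {2}], [{2, 3}, {1}]))
```

The enumeration is exponential in the number of variables, which is why it serves only as a check on examples of this size.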

Theorem 21 GAC($X <_m Y$) is strictly stronger than GAC(gcc(X, d, OX)), GAC(gcc(Y,
d, OY)), and GAC($OX <_{lex} OY$), where d is the vector of values that the variables can take,
arranged in descending order, without any repetition.

Proof: The example in Theorem 20 shows the strictness. QED.

In Theorem 17 we have established that $occ(x) \leq_{lex} occ(y)$ iff $sort(x) \leq_{lex} sort(y)$. Combining
this with the equivalence between multiset ordering and lexicographic ordering of occurrence
vectors, two vectors of integers x and y are multiset
ordered, {{x}} $\leq_m$ {{y}}, iff $sort(x) \leq_{lex} sort(y)$. This suggests another way of decomposing a
multiset ordering constraint $X \leq_m Y$: we insist that the sorted versions of the vectors assigned
to X and Y are lexicographically ordered. For this purpose, we can use the constraint sorted
which is available in, for instance, ECLiPSe constraint solver 5.6 [ECL03], SICStus Prolog
3.10.1 [SIC04], and the FaCiLe constraint solver 1.0 [FaC01]. Given a vector of variables X,
sorted(X, SX) ensures that SX is of length n and is a sorted permutation of X. To ensure
multiset ordering, we can enforce a lexicographic ordering constraint on a pair of vectors which
are constrained to be the sorted versions of the original vectors in descending order:

$$sorted(X, SX) \wedge sorted(Y, SY) \wedge SX \leq_{lex} SY$$

A strict multiset ordering constraint $X <_m Y$ is then achieved by enforcing a strict lexicographic
ordering constraint on the sorted vectors:

$$sorted(X, SX) \wedge sorted(Y, SY) \wedge SX <_{lex} SY$$

We call this way of decomposing a multiset ordering constraint the sort decomposition.

The sorted constraint has previously been studied and some BC filtering algorithms have
been proposed [BC97] [BC00] [MT00]. Unfortunately, we lose in the amount of constraint
propagation also by the sort decomposition of a multiset ordering constraint.

Theorem 22 GAC($X \leq_m Y$) is strictly stronger than GAC(sorted(X, SX)), GAC(sorted
(Y, SY)), and GAC($SX \leq_{lex} SY$).

Proof: Since $X \leq_m Y$ is GAC, every value has a support x and y where $sort(x) \leq_{lex} sort(y)$,
in which case all the three constraints posted in the decomposition are satisfied. Hence, every
constraint imposed is GAC, and GAC($X \leq_m Y$) is as strong as its decomposition. To show
strictness, consider X = ({0,3}, {2}) and Y = ({2,3}, {1}). The multiset ordering constraint
$X \leq_m Y$ is not GAC as 3 in $\mathcal{D}(X_0)$ has no support. By enforcing GAC(sorted(X, SX)) and
GAC(sorted(Y, SY)) we obtain the following vectors:

SX = ({2,3}, {0,2})
SY = ({2,3}, {1})

Since we have GAC($SX \leq_{lex} SY$), X and Y remain unchanged. QED.

Theorem 23 GAC($X <_m Y$) is strictly stronger than GAC(sorted(X, SX)), GAC(sorted
(Y, SY)), and GAC($SX <_{lex} SY$).

Proof: The example in Theorem 22 shows strictness. QED.

How do the two decompositions compare? Assuming that GAC is enforced on every n-ary
constraint of a decomposition, the sort decomposition is superior to the gcc decomposition.

Theorem 24 The sort decomposition of X <m Y is strictly stronger than the gcc decomposition 

0fX<,nY. 

Proof: Assume that a value is pruned from $X$ due to the gcc decomposition. Then, there is
an index $\alpha$ such that $\neg(OX_\alpha = OY_\alpha)$ and for all $i > \alpha$ we have $OX_i = OY_i$. Moreover, we
have $min(OX_\alpha) = max(OY_\alpha)$ and $max(OX_\alpha) > max(OY_\alpha)$. The reason is that, only in this
case, $GAC(OX \leq_{lex} OY)$ will not only prune values from $OX_\alpha$ but also from $X$. In any other
case, we will either get no pruning at $OX_\alpha$, or the pruning at $OX_\alpha$ will reduce the number of
occurrences of $\alpha$ in $X$ without deleting any of $\alpha$ from $X$. Now consider the vectors $SX$ and
$SY$. We name the index of $SX$ and $SY$, where $\alpha$ first appears in the domains of $SX$ and $SY$,
as $i$. Since the number of occurrences of any value greater than $\alpha$ is already determined and is
the same in both $X$ and $Y$, the subvectors of $SX$ and $SY$ above $i$ are ground and equal. For
all $i \leq j < i + min(OX_\alpha)$, we have $SX_j = SY_j = \alpha$. Since $max(OX_\alpha) > max(OY_\alpha)$, at position
$k = i + min(OX_\alpha)$ we will have $\alpha$ in $\mathcal{D}(SX_k)$ but not in $\mathcal{D}(SY_k)$, whose values are less than $\alpha$.
To have $SX \leq_{lex} SY$, $\alpha$ in $\mathcal{D}(SX_k)$ is eliminated. This propagates to the pruning of $\alpha$ from
the remaining variables of $SX$, as well as from the domains of the uninstantiated variables of $X$.
Hence, any value removed from $X$ due to the gcc decomposition is removed from $X$ also by the
sort decomposition. The proof can easily be reversed for values being removed from $Y$.

To show that the sort decomposition dominates the gcc decomposition, consider $X = \langle \{1,2\} \rangle$
and $Y = \langle \{0,1,2\} \rangle$ where 0 in $\mathcal{D}(Y_0)$ is inconsistent and therefore $X \leq_m Y$ is not GAC. We
have $SX = \langle \{1,2\} \rangle$ and $SY = \langle \{0,1,2\} \rangle$ by $GAC(sorted(X, SX))$ and $GAC(sorted(Y, SY))$,
and $OX = \langle \{0,1\}, \{0,1\}, \{0\} \rangle$ and $OY = \langle \{0,1\}, \{0,1\}, \{0,1\} \rangle$ by $GAC(gcc(X, \langle 2,1,0 \rangle, OX))$
and $GAC(gcc(Y, \langle 2,1,0 \rangle, OY))$. To achieve $GAC(SX \leq_{lex} SY)$, 0 in $\mathcal{D}(SY_0)$ is pruned. This
leads to the pruning of 0 also from $\mathcal{D}(Y_0)$ so as to establish $GAC(sorted(Y, SY))$. On the other
hand, we have $GAC(OX \leq_{lex} OY)$, in which case no value is pruned from any variable. QED.
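The two decompositions rest on two ground characterisations of the multiset ordering: comparing descending sorts lexicographically, and comparing occurrence vectors (indexed by values in descending order) lexicographically. The sketch below first confirms on small ground vectors that the two tests agree, then replays the example above by brute force. It is an illustration under these assumptions rather than an implementation of the sorted or gcc propagators; occ, project and gac_lex_leq are hypothetical helper names.

```python
from itertools import product

def occ(vec, values):
    """Occurrence vector of vec over `values` (listed in descending order)."""
    return tuple(sum(1 for v in vec if v == a) for a in values)

def project(tuples, width):
    """Collect, per position, the values that occur in the given tuples."""
    doms = [set() for _ in range(width)]
    for t in tuples:
        for i, v in enumerate(t):
            doms[i].add(v)
    return doms

def gac_lex_leq(da, db):
    """Brute-force GAC on A <=_lex B."""
    n = len(da)
    sols = [t for t in product(*da, *db) if t[:n] <= t[n:]]
    doms = project(sols, 2 * n)
    return doms[:n], doms[n:]

values = (2, 1, 0)

# Ground level: both characterisations give the same ordering.
ground = list(product(values, repeat=3))
same_order = all(
    (sorted(x, reverse=True) <= sorted(y, reverse=True))
    == (occ(x, values) <= occ(y, values))
    for x in ground for y in ground
)
print(same_order)

# Domain level: the example of the proof.
X = [{1, 2}]
Y = [{0, 1, 2}]
OX = project([occ(t, values) for t in product(*X)], len(values))
OY = project([occ(t, values) for t in product(*Y)], len(values))
SX = project([tuple(sorted(t, reverse=True)) for t in product(*X)], len(X))
SY = project([tuple(sorted(t, reverse=True)) for t in product(*Y)], len(Y))

print(gac_lex_leq(OX, OY) == (OX, OY))   # gcc route: nothing is pruned
print(gac_lex_leq(SX, SY))               # sort route: 0 leaves D(SY_0)
```

The gcc route loses information because the lexicographic constraint sees each occurrence variable in isolation and no longer knows that the occurrences of a vector must sum to its length.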

Theorem 25 The sort decomposition of $X <_m Y$ is strictly stronger than the gcc decomposition
of $X <_m Y$.

Proof: The example in Theorem 24 shows strictness. QED.

Even though the sort decomposition of $X \leq_m Y$ is stronger than the gcc decomposition of
$X \leq_m Y$, GAC on $X \leq_m Y$ can lead to more pruning than either of the two decompositions. A
similar argument holds also for $X <_m Y$. Hence, it can be preferable to post and propagate
multiset ordering constraints via our global constraints.

8 Multiple Vectors 

We often have multiple multiset ordering constraints. For example, we post multiset ordering 
constraints on the rows or columns of a matrix of decision variables because we want to break 
row or column symmetry. We can treat such a problem as a single global ordering constraint over 
the whole matrix. Alternatively, we can decompose it into multiset ordering constraints between 
adjacent or all pairs of vectors. In this section, we demonstrate that such decompositions hinder 
constraint propagation. 

The following theorems hold for n vectors of m constrained variables. 

Theorem 26 $GAC(X_i \leq_m X_j)$ for all $0 \leq i < j \leq n-1$ is strictly stronger than
$GAC(X_i \leq_m X_{i+1})$ for all $0 \leq i < n-1$.

Proof: $GAC(X_i \leq_m X_j)$ for all $0 \leq i < j \leq n-1$ is as strong as $GAC(X_i \leq_m X_{i+1})$ for all
$0 \leq i < n-1$, because the former implies the latter. To show strictness, consider the following
3 vectors:

$X_0 = \langle \{0,3\}, \{2\} \rangle$
$X_1 = \langle \{0,1,2,3\}, \{0,1,2,3\} \rangle$
$X_2 = \langle \{2,3\}, \{1\} \rangle$

We have $GAC(X_i \leq_m X_{i+1})$ for all $0 \leq i < 2$. The assignment $X_{0,0} \leftarrow 3$ forces $X_0$ to be $\langle 3,2 \rangle$,
and we have $ceiling(X_2) = \langle 3,1 \rangle$. Since $\{\{3,2\}\} >_m \{\{3,1\}\}$, $GAC(X_0 \leq_m X_2)$ does not hold.
QED.
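The argument of the proof can be replayed with a simple support test. The sketch assumes that $X \leq_m Y$ has a solution exactly when the vector of domain minima of $X$ is multiset-smaller than or equal to the vector of domain maxima of $Y$; this holds because the ordering is monotone in every component. The name floor for the vector of minima is an assumption of this sketch, chosen as the dual of the ceiling used above, and is_gac and feasible are hypothetical helpers.

```python
def desc(vec):
    """Sort a ground vector in descending order (its multiset view)."""
    return tuple(sorted(vec, reverse=True))

def floor(dom_vec):
    """Vector of domain minima (assumed dual of ceiling)."""
    return [min(d) for d in dom_vec]

def ceiling(dom_vec):
    """Vector of domain maxima."""
    return [max(d) for d in dom_vec]

def feasible(dx, dy):
    """X <=_m Y has a solution iff floor(X) <=_m ceiling(Y)."""
    return desc(floor(dx)) <= desc(ceiling(dy))

def is_gac(dx, dy):
    """Check that every value of every variable has a support."""
    for i, d in enumerate(dx):
        for v in d:
            if not feasible(dx[:i] + [{v}] + dx[i + 1:], dy):
                return False
    for i, d in enumerate(dy):
        for v in d:
            if not feasible(dx, dy[:i] + [{v}] + dy[i + 1:]):
                return False
    return True

X0 = [{0, 3}, {2}]
X1 = [{0, 1, 2, 3}, {0, 1, 2, 3}]
X2 = [{2, 3}, {1}]

print(is_gac(X0, X1), is_gac(X1, X2))   # the adjacent pairs
print(is_gac(X0, X2))                   # the non-adjacent pair
print(desc([3, 2]), desc(ceiling(X2)))  # the two multisets compared in the proof
```

The middle vector $X_1$ has full domains, so it acts as a buffer that hides the conflict between $X_0$ and $X_2$ from the chain of adjacent constraints.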

Theorem 27 $GAC(X_i <_m X_j)$ for all $0 \leq i < j \leq n-1$ is strictly stronger than
$GAC(X_i <_m X_{i+1})$ for all $0 \leq i < n-1$.

Proof: The example in Theorem 26 shows strictness. QED.

Theorem 28 $GAC(\forall ij \; 0 \leq i < j \leq n-1 \,.\, X_i \leq_m X_j)$ is strictly stronger than
$GAC(X_i \leq_m X_j)$ for all $0 \leq i < j \leq n-1$.

Proof: $GAC(\forall ij \; 0 \leq i < j \leq n-1 \,.\, X_i \leq_m X_j)$ is as strong as $GAC(X_i \leq_m X_j)$ for all
$0 \leq i < j \leq n-1$, because the former implies the latter. To show strictness, consider the
following 3 vectors:

$X_0 = \langle \{0,3\}, \{1\} \rangle$
$X_1 = \langle \{0,2\}, \{0,1,2,3\} \rangle$
$X_2 = \langle \{0,1\}, \{0,1,2,3\} \rangle$

We have $GAC(X_i \leq_m X_j)$ for all $0 \leq i < j \leq 2$. The assignment $X_{0,0} \leftarrow 3$ is supported
by $X_0 \leftarrow \langle 3,1 \rangle$, $X_1 \leftarrow \langle 2,3 \rangle$, and $X_2 \leftarrow \langle 1,3 \rangle$. In this case, $X_1 \leq_m X_2$ is false. Therefore,
$GAC(\forall ij \; 0 \leq i < j \leq 2 \,.\, X_i \leq_m X_j)$ does not hold. QED.
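The gap between the pairwise constraints and the single global constraint can be observed by enumerating the three vectors jointly. The following is a brute-force sketch for this small example only; gac_pair and gac_chain are hypothetical names, and the joint check orders adjacent vectors only, which is sufficient because the multiset ordering is transitive.

```python
from itertools import product

def desc(vec):
    """Sort a ground vector in descending order (its multiset view)."""
    return tuple(sorted(vec, reverse=True))

def project(tuples, width):
    """Collect, per position, the values that occur in the given tuples."""
    doms = [set() for _ in range(width)]
    for t in tuples:
        for i, v in enumerate(t):
            doms[i].add(v)
    return doms

def gac_pair(dx, dy):
    """Brute-force GAC on a single constraint X <=_m Y."""
    n = len(dx)
    sols = [t for t in product(*dx, *dy) if desc(t[:n]) <= desc(t[n:])]
    doms = project(sols, 2 * n)
    return doms[:n], doms[n:]

def gac_chain(vectors):
    """Brute-force GAC on the conjunction X_0 <=_m X_1 <=_m ... taken as one constraint."""
    n = len(vectors[0])
    flat = [d for vec in vectors for d in vec]
    sols = []
    for t in product(*flat):
        rows = [desc(t[k * n:(k + 1) * n]) for k in range(len(vectors))]
        # <=_m is transitive, so ordering adjacent rows orders all pairs.
        if all(a <= b for a, b in zip(rows, rows[1:])):
            sols.append(t)
    doms = project(sols, len(flat))
    return [doms[k * n:(k + 1) * n] for k in range(len(vectors))]

X0 = [{0, 3}, {1}]
X1 = [{0, 2}, {0, 1, 2, 3}]
X2 = [{0, 1}, {0, 1, 2, 3}]
vectors = [X0, X1, X2]

pairwise_gac = all(
    gac_pair(vectors[i], vectors[j]) == (vectors[i], vectors[j])
    for i in range(3) for j in range(i + 1, 3)
)
joint = gac_chain(vectors)
print(pairwise_gac)   # every pair is GAC on its own
print(joint[0])       # domains of X_0 under the joint constraint
```

Each pairwise constraint finds its own support for $X_{0,0} \leftarrow 3$, but the supports disagree on $X_1$ and $X_2$, so no single assignment satisfies all three constraints at once.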

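Theorem 29 $GAC(\forall ij \; 0 \leq i < j \leq n-1 \,.\, X_i <_m X_j)$ is strictly stronger than
$GAC(X_i <_m X_j)$ for all $0 \leq i < j \leq n-1$.

Proof: $GAC(\forall ij \; 0 \leq i < j \leq n-1 \,.\, X_i <_m X_j)$ is as strong as $GAC(X_i <_m X_j)$ for all
$0 \leq i < j \leq n-1$, because the former implies the latter. To show strictness, consider the
following 3 vectors:

$X_0 = \langle \{0,3\}, \{1\} \rangle$
$X_1 = \langle \{1,3\}, \{0,1,3\} \rangle$
$X_2 = \langle \{0,2\}, \{0,1,2,3\} \rangle$

We have $GAC(X_i <_m X_j)$ for all $0 \leq i < j \leq 2$. The assignment $X_{0,0} \leftarrow 3$ is supported
by $X_0 \leftarrow \langle 3,1 \rangle$, $X_1 \leftarrow \langle 3,3 \rangle$, and $X_2 \leftarrow \langle 2,3 \rangle$. In this case, $X_1 <_m X_2$ is false. Therefore,
$GAC(\forall ij \; 0 \leq i < j \leq 2 \,.\, X_i <_m X_j)$ does not hold. QED.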

9 Experiments 

We implemented our global constraints $\leq_m$ and $<_m$ in C++ using ILOG Solver 5.3 [ILO02].
Due to the absence of the sorted constraint in Solver 5.3, the multiset ordering constraint is
decomposed via the gcc decomposition using the IloDistribute constraint. This constraint is the gcc
constraint but it does not always prune completely the occurrence vectors as described before.

In the experiments, we have a matrix of decision variables where the rows and/or columns 
are (partially) symmetric. To break the symmetry, we post multiset ordering constraints on the 
adjacent symmetric rows or columns, and address several questions in the context of looking for 
one solution or the optimal solution. First, does our filtering algorithm(s) do more inference in 
practice than its decomposition? Similarly, is the algorithm more efficient in practice than its 
decomposition? Second, is it feasible to post the arithmetic constraint? How does our algorithm 
compare to BC on the arithmetic constraint? Even though studying the effectiveness of the 
multiset ordering constraints in breaking symmetry is out of the scope of this paper, we provide 
experimental evidence of their value in symmetry breaking. 

We report experiments on three problem domains: the progressive party problem, the rack 
configuration problem, and the sport scheduling problem. The decisions made when modelling 
and solving a problem are tuned by our initial experimentation. The results are shown in tables 
where a "-" means no result is obtained in 1 hour (3600 secs). The best result of each entry in
a table is typeset in bold. If posting an ordering constraint on the rows (resp. columns) is done
via a technique called Tech then we write Tech R (resp. Tech C). The ordering constraints are 
enforced just between the adjacent rows and/or columns as we have found it not worthwhile to 
post them between all pairs. 

Finally, the hardware used for the experiments is a 1GHz Pentium III processor with 256MB
RAM running Windows XP.

Constraints:






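(1) $\forall j_1, j_2, j_1 < j_2 \in Guests \,.\, \sum_{i \in Periods} (H_{i,j_1} = H_{i,j_2}) \leq 1$
(2) $\forall j \in Guests \,.\, \textit{all-different}(\langle H_{0,j}, H_{1,j}, \ldots, H_{p-1,j} \rangle)$
(3) $\forall i \in Periods \,.\, \forall k \in Hosts \,.\, \sum_{j \in Guests} gc_j * C_{i,j,k} \leq c_k - hc_k$
(4) $\forall i \in Periods \,.\, \forall j \in Guests \,.\, \sum_{k \in Hosts} C_{i,j,k} = 1$
(5) $\forall i \in Periods \,.\, \forall j \in Guests \,.\, \forall k \in Hosts \,.\, H_{i,j} = k \leftrightarrow C_{i,j,k} = 1$

Figure 2: The matrix model of the progressive party problem in [SBHW96].

9.1 Progressive Party Problem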

The progressive party problem arises in the context of organising the social programme
for a yachting rally (prob013 in CSPLib). We consider a variant of the problem proposed in
[SBHW96]. There is a set $Hosts$ of host boats and a set $Guests$ of guest boats. Each host boat
$i$ is characterised by a tuple $\langle hc_i, c_i \rangle$, where $hc_i$ is its crew size and $c_i$ is its capacity; and
each guest boat is described by $gc_i$ giving its crew size. The problem is to assign hosts to guests
over $p$ time periods such that:

• a guest crew never visits the same host twice; 

• no two guest crews meet more than once; 

• the spare capacity of each host boat, after accommodating its own crew, is not exceeded. 

A matrix model of this problem is given in [SBHW96]. It has a 2-d matrix $H$ to represent
the assignment of hosts to guests in time periods (see Figure 2). The matrix $H$ is indexed by the
set $Periods$ of time periods and $Guests$, taking values from $Hosts$. The first constraint enforces
that two guests can meet at most once by introducing a new set of 0/1 variables:

$\forall i \in Periods \,.\, \forall j_1, j_2, j_1 < j_2 \in Guests \,.\, M_{i,j_1,j_2} = 1 \leftrightarrow H_{i,j_1} = H_{i,j_2}$

The sum of these new variables is then constrained to be at most 1. The all-different constraints
on the rows of this matrix ensure that no guest revisits a host. Additionally, a 3-d 0/1 matrix $C$
of $Periods \times Guests \times Hosts$ is used. A variable $C_{i,j,k}$ in this new matrix is 1 iff the host boat
$k$ is visited by guest $j$ in period $i$. Even though $C$ replicates the information held in the 2-d
matrix, it allows capacity constraints to be stated concisely. The sum constraints on $C$ ensure
that a guest is assigned to exactly one host in each time period. Finally, channelling constraints
are used to link the variables of $H$ and $C$.
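The model can be read as a checker for a candidate assignment. The sketch below tests constraints (1) to (3) directly on $H$; constraints (4) and (5) only tie $C$ to $H$, so they hold by construction once $C$ is derived from $H$. The function name check_party, the layout H[i][j] (period i, guest j) and the small instance are assumptions made for illustration and are not taken from [SBHW96].

```python
from itertools import combinations

def check_party(H, gc, hc, cap):
    """Check a ground assignment H[i][j] = host visited by guest j in period i."""
    periods = range(len(H))
    guests = range(len(gc))
    hosts = range(len(cap))

    # (1) two guests meet at most once over all periods
    for j1, j2 in combinations(guests, 2):
        if sum(H[i][j1] == H[i][j2] for i in periods) > 1:
            return False

    # (2) a guest never visits the same host twice
    for j in guests:
        visited = [H[i][j] for i in periods]
        if len(set(visited)) != len(visited):
            return False

    # (3) the spare capacity c_k - hc_k of each host is respected in each period
    for i in periods:
        for k in hosts:
            load = sum(gc[j] for j in guests if H[i][j] == k)
            if load > cap[k] - hc[k]:
                return False
    return True

# Hypothetical instance: 2 hosts with capacity 6 and crew 2, guest crews 3 and 4.
gc, hc, cap = [3, 4], [2, 2], [6, 6]
print(check_party([[0, 1], [1, 0]], gc, hc, cap))  # guests swap hosts
print(check_party([[0, 1], [0, 1]], gc, hc, cap))  # guests revisit their host
print(check_party([[0, 0], [1, 1]], gc, hc, cap))  # guests meet twice
```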

The time periods as well as the guests with equal crew size are indistinguishable. Hence, this 
model of the problem has partial row symmetry between the indistinguishable guests of H, and 
column symmetry. In the following we first show that multiset ordering constraints are useful
in breaking index symmetry.

Table 1: Progressive party problem with row-wise labelling of H.

Table 2: Progressive party problem with column-wise labelling of H.



To break the row and column symmetries, we can utilise both lexicographic ordering and
multiset ordering constraints, as well as combine lexicographic ordering constraints in one
dimension of the matrix with multiset ordering constraints in the other. Due to the problem
constraints, no pair of rows/columns can have equal assignments, but they can be equal when
viewed as multisets. This gives us the models $<_{lex}$RC, $\leq_m$RC, $\leq_m$R $\geq_m$C, $\leq_m$R $<_{lex}$C, $\leq_m$R
$>_{lex}$C, $<_{lex}$R $\leq_m$C, and $<_{lex}$R $\geq_m$C. As the matrix $H$ has partial row symmetry, the ordering
constraints on the rows are posted on only the symmetric rows. The ordering constraints on the
columns are, however, posted on all the columns.

In our experiments, we compare the models described above against the initial model
of the problem in which no symmetry breaking ordering constraints are imposed. We consider
the original instance of the progressive party problem described in [SBHW96], with 5 and 6 time
periods. As in [SBHW96], we give priority to the largest crews, so the guest boats are ordered
in descending order of their size. Also, when assigning a host to a guest, we first try the value
which is most likely to succeed. We therefore order the host boats in descending order of their
spare capacity. We adopt two static variable orderings, and instantiate $H$ either along its rows
from top to bottom, or along its columns from left to right.

The results of the experiments are shown in Tables 1 and 2. With row-wise labelling of
$H$, we cannot solve the problem with 6 time periods with or without the symmetry breaking
ordering constraints. As for the other instance, whilst many of the models we have considered
give significantly smaller search trees and shorter run-times, $\leq_m$RC and $<_{lex}$R $\geq_m$C cannot
return an answer within an hour time limit. The smallest search tree and also the shortest
solving time is obtained by $<_{lex}$R $\leq_m$C, in which case the reduction in the search effort is
noteworthy compared to the model in which no ordering constraints are imposed. This supports
our conjecture that lexicographic ordering constraints in one dimension of a matrix combined
with multiset ordering constraints in the other can break more symmetry than lexicographic
ordering or multiset ordering constraints on both dimensions.

Table 3: Instance specification for the progressive party problem.

Table 4: Progressive party problem: MsetLeq vs gcc decomposition and the arithmetic constraint
with row-wise labelling.

Next, we show that our filtering algorithm is the most effective of the three ways we consider
for propagating multiset ordering constraints. To simplify the presentation, we address only the
row symmetry. Given a set of indistinguishable guests $\{g_i, g_{i+1}, \ldots, g_j\}$, we insist that the rows
corresponding to such guests are multiset ordered: $R_i \leq_m R_{i+1} \ldots \leq_m R_j$. We impose such
constraints by either using our filtering algorithm MsetLeq, or the gcc decomposition, or the
arithmetic constraint.

We now consider several instances of the problem using the problem data given in CSPLib.
We randomly select the host boats in such a way that the total spare capacity of the host boats
is sufficient to accommodate all the guests. Table 3 shows the data. The last column of Table 3
gives the percentage of the total capacity used, which is a measure of constrainedness [Wal99].
We instantiate $H$ row-wise following the same protocol described previously.

The results of the experiments are shown in Table 4. Note that all the problem instances
are solved for 5 time periods. The results show that MsetLeq maintains a significant advantage
over the gcc decomposition and the arithmetic constraint. The solutions to the instances which
can be solved within an hour limit are found quicker and, compared to the gcc decomposition,
with far fewer failures. Note that MsetLeq and the arithmetic constraint methods create the
same search tree.
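This section does not restate the arithmetic constraint. The sketch below assumes the usual exponential encoding, in which two vectors of length $n \geq 2$ over the naturals satisfy $X \leq_m Y$ exactly when $\sum_i n^{X_i} \leq \sum_i n^{Y_i}$. It checks the equivalence exhaustively on small ground vectors and shows how quickly the sums grow; the function name weight and the sizes used are hypothetical.

```python
from itertools import product

def desc(vec):
    """Sort a ground vector in descending order (its multiset view)."""
    return tuple(sorted(vec, reverse=True))

def weight(vec):
    """Sum of n**v over the vector, with n the vector length (assumes n >= 2)."""
    n = len(vec)
    return sum(n ** v for v in vec)

# Exhaustive check on vectors of length 3 over the values 0..3.
n, max_value = 3, 3
vectors = list(product(range(max_value + 1), repeat=n))
agree = all(
    (desc(x) <= desc(y)) == (weight(x) <= weight(y))
    for x in vectors for y in vectors
)
print(agree)

# The sums grow exponentially in the largest value:
# a vector of 13 variables all equal to 12 already weighs 13**13.
print(weight((12,) * 13))
```

Under this assumption the two formulations accept the same ground assignments, while the arithmetic form has to manipulate very large coefficients, which is consistent with the run-time differences reported here.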

9.2 Rack Configuration Problem 

The rack configuration problem consists of plugging a set of electronic cards into racks with
electronic connectors (prob031 in CSPLib). Each card is a certain card type. A card type $i$ in
the set $Ctypes$ is characterised by a tuple $\langle cp_i, d_i \rangle$, where $cp_i$ is the power it requires, and $d_i$ is
the demand, which designates how many cards of that type have to be plugged. In order to plug
a card into a rack, the rack needs to be assigned a rack model.

Figure 3: The matrix model of the rack configuration problem in [ILO02].

Each rack model $i$ in the set $RackModels$ is characterised by a tuple $\langle rp_i, c_i, s_i \rangle$, where $rp_i$
is the maximal power it can supply, $c_i$ is its number of connectors, and $s_i$ is its price. Each card
plugged into a rack uses a connector. The problem is to decide how many among the set $Racks$
of available racks are needed, and which model the racks are in order to plug all the cards such
that:

• the number of cards plugged into a rack does not exceed its number of connectors; 

• the total power of the cards plugged into a rack does not exceed its power; 

• all the cards are plugged into some rack; 

• the total price of the racks is minimised. 

A matrix model of this problem is given in [ILO02] and shown in Figure 3. The idea is to
assign a rack model to every available rack. Since some of the racks might not be needed in
an optimal solution, a "dummy" rack model is introduced (i.e., a rack is assigned the dummy
rack model when the rack is not needed). Furthermore, for every available rack, the number of
cards of a particular card type plugged into the rack has to be determined. The assignment of
rack models to racks is represented by a 1-d matrix $R$, indexed by $Racks$, taking values from
$RackModels$ which includes the dummy rack model. In order to represent the number of cards
of a particular card type plugged into a particular rack, a 2-d matrix $C$ of $Ctypes \times Racks$ is
introduced. A variable in this matrix takes values from $\{0, \ldots, maxConn\}$ where $maxConn$ is
the maximum number of cards that can be plugged into any rack.

The dummy rack model is defined as a rack model where the maximal power it can supply, its
number of connectors, and its price are all set to 0. The constraints enforce that the connector
and the power capacities of each rack are not exceeded and every card type meets its demand. The
objective is then to minimise the total cost of the racks.
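The constraints and the objective can be summarised as a checker for a ground assignment. The sketch assumes R[r] is the index of the model of rack r and C[t][r] is the number of cards of type t plugged into rack r; the function name rack_cost and all numeric data are hypothetical and are not the specification of [ILO02].

```python
def rack_cost(R, C, models, cards):
    """Return the total price of a valid configuration, or None if a constraint fails.

    models[m] = (max_power, connectors, price); cards[t] = (power, demand).
    """
    for r, m in enumerate(R):
        max_power, connectors, _ = models[m]
        plugged = [C[t][r] for t in range(len(cards))]
        # the number of cards does not exceed the number of connectors
        if sum(plugged) > connectors:
            return None
        # the total power of the cards does not exceed the power of the rack
        if sum(q * cards[t][0] for t, q in enumerate(plugged)) > max_power:
            return None
    # every card type meets its demand exactly
    for t, (_, demand) in enumerate(cards):
        if sum(C[t]) != demand:
            return None
    return sum(models[m][2] for m in R)

# Hypothetical data: model 0 is the dummy rack model.
models = [(0, 0, 0), (150, 8, 150), (200, 16, 200)]
cards = [(20, 3), (50, 2)]

print(rack_cost([2, 0], [[3, 0], [2, 0]], models, cards))  # one large rack, one unused
print(rack_cost([1, 1], [[3, 0], [1, 1]], models, cards))  # two small racks
print(rack_cost([1, 0], [[3, 0], [1, 1]], models, cards))  # a card in a dummy rack
```

Because the dummy model has zero connectors and zero power, any card plugged into an unused rack violates the capacity checks, so the dummy model needs no special case.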

Table 5: Rack model and card type specifications in the rack configuration problem [ILO02].

Table 6: Demand specification for the cards in the rack configuration problem.

Table 7: Rack configuration problem: MsetLeq vs the arithmetic constraint.

The 2-d matrix $C$ has partial row symmetry, because racks of the same rack model are
indistinguishable and therefore their card assignments can be interchanged. To break this symmetry,
we post multiset ordering constraints on the rows conditionally. Given two racks $i$ and $j$, we
enforce that the rows corresponding to such racks are multiset ordered if the racks are assigned
the same rack model. That is:

$R_i = R_j \rightarrow \langle C_{0,i}, \ldots, C_{n-1,i} \rangle \leq_m \langle C_{0,j}, \ldots, C_{n-1,j} \rangle$

where $n$ is the number of card types. We impose such constraints by either using our filtering
algorithm MsetLeq or the arithmetic constraint. Unfortunately, we are unable to compare
MsetLeq against the gcc decomposition in this problem, as Solver 5.3 does not allow us to post
the IloDistribute constraint conditionally.

We consider several instances of the rack configuration problem, which are described in
Tables 5 and 6. In the experiments, we use the rack model and card type specifications given
in [ILO02], but we vary the demand of the card types randomly. As in [ILO02], we search for
the optimal solution by exploring the racks in turn. For each rack, we first instantiate its model
and then determine how many cards from each card type are plugged into the rack.

The results of the experiments are shown in Table 7. MsetLeq is clearly much more efficient
than the arithmetic constraint on every instance considered. Note that the two methods create
the same search tree.



n   Model                     Fails     Choice points   Time (secs.)
5   MsetLess C                1         10              0.8
    Arithmetic Constraint C   1         10              0.9
    gcc C                     2         11              1.2
7   MsetLess C                69        87              0.8
    Arithmetic Constraint C   69        87              1.3
    gcc C                               110             1.3
9   MsetLess C                760,973   761,003         121.3
    Arithmetic Constraint C   760,973   761,003         2500
    gcc C

Table 8: Sport scheduling problem: MsetLess vs gcc decomposition and the arithmetic
constraint with column-wise labelling. For one column, we first label the first slots; for the other,
we first label the second slots.

9.3 Sport Scheduling Problem 

This problem was introduced in Section 2. Figure 1 shows a matrix model. The (extended)
weeks over which the tournament is held, as well as the periods of a week, are indistinguishable.
The rows and the columns of $T$ and $G$ are therefore symmetric. Note that we treat $T$ as a 2-d
matrix where the rows represent the periods and columns represent the (extended) weeks, and
each entry of the matrix is a pair of values. The global cardinality constraints posted on the
rows of $T$ ensure that each of $1 \ldots n$ occurs exactly twice in every row. In any solution to the
problem, the rows when viewed as multisets are therefore equal. The all-different constraints
posted on the columns state that each column is a permutation of $1 \ldots n$. Thus, the columns are
also equal when viewed as multisets. Therefore, we cannot utilise multiset ordering constraints
to break row and/or column symmetry of this model of the problem.

Scheduling a tournament between $n$ teams means arranging $n(n-1)/2$ games. The model
described in Figure 1 assumes $n$ is an even number. If $n$ is an odd number instead, then we
can still schedule $n(n-1)/2$ games provided that the games are played over $n$ weeks and each
week is divided into $(n-1)/2$ periods. The problem now requires that each team plays at most
once a week, and every team plays exactly twice in the same period over the tournament. This
version of the problem can be modelled using the original model in Figure 1, as the all-different
constraints on the rows and the cardinality constraints on the columns enforce the new problem
constraints.

We can now post multiset ordering constraints on the columns of $T$ to break column
symmetry. Since the games are all different, no pair of columns can be equal, when viewed as
multisets. Hence, we insist that the columns corresponding to the $n$ weeks are strict multiset
ordered: $C_0 <_m C_1 \ldots <_m C_{n-1}$. We enforce such constraints by either using our filtering
algorithm MsetLess, or the gcc decomposition, or the arithmetic constraint. Since the multiset
ordering constraints are posted on the columns, we instantiate $T$ column-by-column. For one
column, we first label the first slots; for the other, we first label the second slots. The results
are shown in Table 8.

We observe that MsetLess is superior to the gcc decomposition. As the problem gets more
difficult, MsetLess does more pruning and solves the problem quicker. The results moreover
indicate a substantial gain in efficiency by using MsetLess in preference to the arithmetic
constraint. Even though the same search tree is created by the two, constructing and propagating
the arithmetic constraints is much more costly than running MsetLess to solve the multiset
ordering constraints.

10 Conclusions



We have developed filtering algorithms for the multiset ordering (global) constraint $X \leq_m Y$
defined on a pair of vectors of variables. It ensures that the values taken by the vectors $X$
and $Y$, when viewed as multisets, are ordered. This global constraint is useful for breaking row
and column symmetries of a matrix model and when searching for leximin solutions in fuzzy
constraints. The filtering algorithms either prove that $X \leq_m Y$ is disentailed, or ensure GAC
on $X \leq_m Y$.

The first algorithm MsetLeq is useful when $d \ll n$ and runs in $O(n)$ where $n$ is the length
of the vectors and $d$ is the number of distinct values. This is often the case as the number of
distinct values in a multiset is typically less than its cardinality to permit repetition. We further
proposed another variant of the algorithm suitable when $d \gg n$. This identifies support by
lexicographically ordering suitable sorted vectors. The complexity is then independent of the
number of distinct values and is $O(n \log(n))$, as the cost of sorting dominates. We also have
shown that MsetLeq can easily be modified for $X <_m Y$ by changing the definition of one of
the flags. Moreover, the ease of maintaining the occurrence vectors incrementally helps detect
entailment in a simple and dual manner to detecting disentailment.

Our experiments on the progressive party problem, the rack configuration problem, and
the sport scheduling problem support the usefulness of multiset ordering constraints in the
context of symmetry breaking and support our theoretical studies: even if it is feasible to post
the arithmetic constraint, it is much more efficient to propagate the multiset ordering constraint
using our filtering algorithm; furthermore, decomposing the multiset constraint carries a penalty
either in the amount or the cost of constraint propagation.

In our future work, we plan to investigate whether the incremental cost for propagation can
be made less than linear time. Moreover, we plan to understand whether it is worthwhile to
propagate a chain of multiset ordering constraints and, if that is the case, devise an efficient filtering
algorithm.